Abstract: We present a mathematical connection between
channel coding and compressed sensing. In particular, we link, on
the one hand, channel coding linear programming decoding (CC-LPD),
which is a well-known relaxation of maximum-likelihood
channel decoding for binary linear codes, and, on the other 
hand, compressed sensing linear programming decoding (CS-LPD), 
also known as basis pursuit, which is a widely used linear 
programming relaxation for the problem of finding the sparsest 
solution of an under-determined system of linear equations. More 
specifically, we establish a tight connection between CS-LPD 
based on a zero-one measurement matrix over the reals and 
CC-LPD of the binary linear channel code that is obtained 
by viewing this measurement matrix as a binary parity-check 
matrix. This connection allows the translation of performance 
guarantees from one setup to the other. The main message of 
this paper is that parity-check matrices of "good" channel codes 
can be used as provably "good" measurement matrices under 
basis pursuit. In particular, we provide the first deterministic 
construction of compressed sensing measurement matrices with 
an order-optimal number of rows using high-girth low-density 
parity-check (LDPC) codes constructed by Gallager. 

Index Terms — Approximation guarantee, basis pursuit, chan- 
nel coding, compressed sensing, graph cover, linear programming 
decoding, pseudo-codeword, pseudo-weight, sparse approxima- 
tion, zero-infinity operator. 



I. Introduction 

Recently, there has been substantial interest in the
theory of recovering sparse approximations of signals
that satisfy linear measurements. Compressed sensing research
(see, for example, [3], [4]) has developed conditions for
measurement matrices under which (approximately) sparse signals
can be recovered by solving a linear programming relaxation
of the original NP-hard combinatorial problem. This linear
programming relaxation is usually known as "basis pursuit."

In particular, in one of the first papers in this area, cf. [3],
Candes and Tao presented a setup they called "decoding by
linear programming," henceforth called compressed sensing
linear programming decoding (CS-LPD), where the sparse
signal corresponds to real-valued noise that is added to a
real-valued signal that is to be recovered in a hypothetical
communication problem.

At about the same time, in an independent line of research,
Feldman, Wainwright, and Karger considered the problem of
decoding a binary linear code that is used for data communication
over a binary-input memoryless channel, a problem
that is also NP-hard in general. In [5], [6], they formulated
this channel coding problem as an integer linear program,
along with presenting a linear programming relaxation for it,
henceforth called channel coding linear programming decoding
(CC-LPD). Several theoretical results were subsequently
proven about the efficiency of CC-LPD, in particular for
low-density parity-check (LDPC) codes (see, e.g., [7]-[10]).

As we will see in the subsequent sections, CS-LPD and CC-LPD
(and the setups they are derived from) look like similar
linear programming relaxations; however, a priori it is rather
unclear if there is a connection beyond this initial superficial
similarity. The main technical difference is that CS-LPD is
a relaxation of the objective function of a problem that is
naturally over the reals while CC-LPD involves a polytope
relaxation of a problem defined over a finite field. Indeed,
Candes and Tao in their original paper asked the question [3,
Section VI.A]: "... In summary, there does not seem to be any
explicit known connection with this line of work [[5], [6]] but
it would perhaps be of future interest to explore if there is
one."

In this paper we present such a connection between CS- 
LPD and CC-LPD. The general form of our results is that 
if a given binary parity-check matrix is "good" for CC-LPD 
then the same matrix (considered over the reals) is a "good" 
measurement matrix for CS-LPD. The notion of a "good" 
parity-check matrix depends on which channel we use (and 
a corresponding channel-dependent quantity called pseudo- 
weight). 

• Based on results for the binary symmetric channel (BSC),
we show that if a parity-check matrix can correct any $k$
bit-flipping errors under CC-LPD, then the same matrix
taken as a measurement matrix over the reals can be used
to recover all $k$-sparse error signals under CS-LPD.

• Based on results for binary-input output-symmetric channels
with bounded log-likelihood ratios, we can extend
the previous result to show that performance guarantees
for CC-LPD for such channels can be translated into
robust sparse-recovery guarantees in the $\ell_1/\ell_1$ sense (see,
e.g., [11]) for CS-LPD.

• Performance guarantees for CC-LPD for the binary-input
additive white Gaussian noise channel (AWGNC) can be
translated into robust sparse-recovery guarantees in the
$\ell_2/\ell_1$ sense for CS-LPD.

• Max-fractional weight performance guarantees for CC-LPD
can be translated into robust sparse-recovery guarantees
in the $\ell_\infty/\ell_1$ sense for CS-LPD.

• Performance guarantees for CC-LPD for the binary erasure
channel (BEC) can be translated into performance
guarantees for the compressed sensing setup where the
support of the error signal is known and the decoder tries
to recover the sparse signal (i.e., tries to solve the linear
equations) by back-substitution only.

All our results are also valid in a stronger, point-wise sense.
For example, for the BSC, if a parity-check matrix can recover 
a given set of k bit flips under CC-LPD, the same matrix will 
recover any sparse signal supported on those k coordinates 
under CS-LPD. In general, "good" performance of CC-LPD 
on a given error support set will yield "good" CS-LPD 
recovery for sparse signals supported by the same set. 

It should be noted that all our results are only one-way: 
we do not prove that a "good" zero-one measurement matrix 
will always be a "good" parity-check matrix for a binary code. 
This remains an interesting open problem. 

Besides these main results we also present reformulations 
of CC-LPD and CS-LPD in terms of so-called graph covers: 
these reformulations will help in seeing further similarities and 
differences between these two linear programming relaxations. 
Moreover, based on an operator that we will call the zero-infinity
operator, we will define an optimization problem called
CS-OPT$_{0,\infty}$, along with a relaxation of it called CS-REL$_{0,\infty}$.
Let CS-OPT be the NP-hard combinatorial problem mentioned
at the beginning of the introduction whose relaxation is
CS-LPD. First, we will show that CS-REL$_{0,\infty}$ is equivalent
to CS-LPD. Secondly, we will argue that the solution of
CS-LPD is "closer" to the solution of CS-OPT$_{0,\infty}$ than the
solution of CS-LPD is to the solution of CS-OPT. This is
interesting because CS-OPT$_{0,\infty}$ is, like CS-OPT, in general
an intractable optimization problem, and so CS-OPT$_{0,\infty}$ is at
least as justifiable as CS-OPT as the difficult optimization problem
whose solution is approximated by CS-LPD.

The organization of this paper is as follows. In Section II
we set up the notation that will be used. Then, in Sections III
and IV we review the compressed sensing and channel coding
problems, along with their respective linear programming
relaxations.

Section V is the heart of this paper: it establishes the lemma
that will bridge CS-LPD and CC-LPD for zero-one matrices.
Technically speaking, this lemma shows that non-zero vectors
in the real nullspace of a measurement matrix (i.e., vectors
that are problematic for CS-LPD) can be mapped to non-zero
vectors in the fundamental cone defined by that same matrix
(i.e., to vectors that are problematic for CC-LPD).

Afterwards, in Section VI, we use the previously developed
machinery to establish the main results of this paper, namely
the translation of performance guarantees from channel coding
to compressed sensing. By relying on prior channel coding
results [10], [12], [13] and the above-mentioned lemma, we
present novel results on sparse compressed sensing matrices.
Perhaps the most interesting corollary involves the sparse
deterministic matrices constructed in Gallager's thesis [14,
Appendix C]. In particular, by combining our translation
results with a recent breakthrough by Arora et al. [13], we
show that high-girth deterministic matrices can be used for
compressed sensing to recover sparse signals. To the best of
our knowledge, this is the first deterministic construction of
measurement matrices with an order-optimal number of rows.

Subsequently, Section VII tightens the connection between
CC-LPD and CS-LPD with the help of graph covers, and
Section VIII presents the above-mentioned results involving the
zero-infinity operator. Finally, some conclusions are presented
in Section IX.

The appendices contain the longer proofs. Moreover,
Appendix D presents three generalizations of the bridge lemma
(cf. Lemma 11 in Section V) to certain types of integer and
complex valued matrices.

II. Basic Notation 

Let $\mathbb{Z}$, $\mathbb{Z}_{\ge 0}$, $\mathbb{Z}_{>0}$, $\mathbb{R}$, $\mathbb{R}_{\ge 0}$, $\mathbb{R}_{>0}$, $\mathbb{C}$, and $\mathbb{F}_2$ be the ring of
integers, the set of non-negative integers, the set of positive
integers, the field of real numbers, the set of non-negative real
numbers, the set of positive real numbers, the field of complex
numbers, and the finite field of size 2, respectively. Unless
noted otherwise, expressions, equalities, and inequalities will
be over the field $\mathbb{R}$. The absolute value of a real number $a$
will be denoted by $|a|$.

The size of a set $S$ will be denoted by $|S|$. For any $M \in \mathbb{Z}_{>0}$,
we define the set $[M] = \{1, \ldots, M\}$.

All vectors will be column vectors. If $a$ is some vector with
integer entries, then $a \pmod 2$ will denote an equally long
vector whose entries are reduced modulo 2. If $S$ is a subset
of the set of coordinate indices of a vector $a$, then $a_S$ is the
vector with $|S|$ entries that contains only the coordinates of
$a$ whose coordinate index appears in $S$. Moreover, if $a$ is a
real vector then we define $|a|$ to be the real vector $a'$ with the
same number of components as $a$ and with entries $a'_i = |a_i|$
for all $i$. Finally, the inner product $\langle a, b \rangle$ of two equally long
vectors $a$ and $b$ is written $\langle a, b \rangle = \sum_i a_i b_i$.

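We define $\operatorname{supp}(a) = \{ i \mid a_i \neq 0 \}$ to be the support set
of some vector $a$. Moreover, we let $\Sigma^{(k)}_{\mathbb{R}^n} = \{ a \in \mathbb{R}^n \mid |\operatorname{supp}(a)| \le k \}$
and $\Sigma^{(k)}_{\mathbb{F}_2^n} = \{ a \in \mathbb{F}_2^n \mid |\operatorname{supp}(a)| \le k \}$
be the sets of vectors in $\mathbb{R}^n$ and $\mathbb{F}_2^n$, respectively, which have
at most $k$ non-zero components. We refer to vectors in these
sets as $k$-sparse vectors.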

For any real vector $a$, we define $\|a\|_0$ to be the $\ell_0$
norm of $a$, i.e., the number of non-zero components of $a$.
Note that $\|a\|_0 = w_{\mathrm{H}}(a) = |\operatorname{supp}(a)|$, where $w_{\mathrm{H}}(a)$ is
the Hamming weight of $a$. Furthermore, $\|a\|_1 = \sum_i |a_i|$,
$\|a\|_2 = \sqrt{\sum_i |a_i|^2}$, and $\|a\|_\infty = \max_i |a_i|$ will denote,
respectively, the $\ell_1$, $\ell_2$, and $\ell_\infty$ norms of $a$.

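For a matrix $M$ over $\mathbb{R}$ with $n$ columns we denote its $\mathbb{R}$-nullspace
by $\operatorname{Nullsp}_{\mathbb{R}}(M) = \{ a \in \mathbb{R}^n \mid M \cdot a = 0 \}$, and for a
matrix $M$ over $\mathbb{F}_2$ with $n$ columns we denote its $\mathbb{F}_2$-nullspace
by $\operatorname{Nullsp}_{\mathbb{F}_2}(M) = \{ a \in \mathbb{F}_2^n \mid M \cdot a = 0 \pmod 2 \}$.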

Let $H = (h_{j,i})_{j,i}$ be some matrix. We denote the sets of row
and column indices of $H$ by $\mathcal{J}(H)$ and $\mathcal{I}(H)$, respectively.
We will also use the sets $\mathcal{J}_i(H) = \{ j \in \mathcal{J} \mid h_{j,i} \neq 0 \}$,
$i \in \mathcal{I}(H)$, and $\mathcal{I}_j(H) = \{ i \in \mathcal{I} \mid h_{j,i} \neq 0 \}$, $j \in \mathcal{J}(H)$.
Moreover, for any set $S \subseteq \mathcal{I}(H)$, we will denote its complement
with respect to $\mathcal{I}(H)$ by $\overline{S}$, i.e., $\overline{S} = \mathcal{I}(H) \setminus S$. In the
following, when no confusion can arise, we will sometimes
omit the argument $H$ in the preceding expressions.

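Finally, for any $n, M \in \mathbb{Z}_{>0}$ and any vector $a \in \mathbb{C}^n$,
we define the $M$-fold lifting of $a$ to be the vector
$a^{\uparrow M} = (a^{\uparrow M}_{(i,m)})_{(i,m)} \in \mathbb{C}^{Mn}$ with components given by

$$ a^{\uparrow M}_{(i,m)} = a_i, \quad (i,m) \in [n] \times [M]. $$

(One can think of $a^{\uparrow M}$ as the Kronecker product of the vector
$a$ with the all-one vector with $M$ components.) Moreover, for
any vector $\tilde{a} = (\tilde{a}_{(i,m)})_{(i,m)} \in \mathbb{C}^{Mn}$ or $\tilde{a} = (\tilde{a}_{(i,m)})_{(i,m)} \in \mathbb{F}_2^{Mn}$
we define the projection of $\tilde{a}$ to the space $\mathbb{C}^n$ to be the
vector $a = \varphi_M(\tilde{a})$ with components given by

$$ a_i = \frac{1}{M} \sum_{m \in [M]} \tilde{a}_{(i,m)}, \quad i \in [n]. $$

(In the case where $\tilde{a}$ is over $\mathbb{F}_2$, the summation is over $\mathbb{C}$ and
we use the standard embedding of $\{0, 1\}$ into $\mathbb{C}$.)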
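The lifting and projection operations can be illustrated with a minimal sketch. The function names `lift` and `project` and the storage layout (the $M$ copies of each entry stored next to each other) are assumptions made here for illustration; they are not notation from the paper.

```python
from fractions import Fraction


def lift(a, M):
    """M-fold lifting: entry (i, m) of the result equals a[i].

    Assumed layout: position i*M + m of the returned list holds entry (i, m).
    """
    return [a_i for a_i in a for _ in range(M)]


def project(a_tilde, M):
    """Projection: average the M entries that belong to each index i.

    Entries over F_2 are treated as the ordinary numbers 0 and 1.
    """
    if len(a_tilde) % M != 0:
        raise ValueError("length of a_tilde must be a multiple of M")
    n = len(a_tilde) // M
    return [
        sum(Fraction(v) for v in a_tilde[i * M:(i + 1) * M]) / M
        for i in range(n)
    ]


a = [3, 0, -2]
a_up = lift(a, 2)          # each entry repeated twice
a_back = project(a_up, 2)  # projecting a lifted vector recovers a
```

Projecting a lifted vector returns the original vector, while projecting a general vector of length $Mn$ can give fractional entries even when all its entries are 0 or 1.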

III. Compressed Sensing 
Linear Programming Decoding 

A. The Setup 

Let $H_{\mathrm{CS}}$ be a real matrix of size $m \times n$, called the measurement
matrix, and let $s$ be a real-valued vector containing $m$
measurements. In its simplest form, the compressed sensing
problem consists of finding the sparsest real vector $e'$ with $n$
components that satisfies $H_{\mathrm{CS}} \cdot e' = s$, namely

CS-OPT : minimize $\|e'\|_0$
         subject to $H_{\mathrm{CS}} \cdot e' = s$.

Assuming that there exists a sparse signal $e$ that satisfies
the measurement $H_{\mathrm{CS}} \cdot e = s$, CS-OPT yields, for suitable
matrices $H_{\mathrm{CS}}$, an estimate $\hat{e}$ that equals $e$.

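This problem can also be interpreted [3] as part of the
decoding problem that appears in a coded data communication
setup where the channel input alphabet is $\mathcal{X}_{\mathrm{CS}} = \mathbb{R}$, the
channel output alphabet is $\mathcal{Y}_{\mathrm{CS}} = \mathbb{R}$, and the information
symbols are encoded with the help of a real-valued code $\mathcal{C}_{\mathrm{CS}}$
of block length $n$ and dimension $k = n - \operatorname{rank}_{\mathbb{R}}(H_{\mathrm{CS}})$ as
follows.

• The code is $\mathcal{C}_{\mathrm{CS}} = \{ x \in \mathbb{R}^n \mid H_{\mathrm{CS}} \cdot x = 0 \}$. Because
of this, the measurement matrix $H_{\mathrm{CS}}$ is sometimes also
called an annihilator matrix.

• A matrix $G_{\mathrm{CS}} \in \mathbb{R}^{n \times k}$ for which $\mathcal{C}_{\mathrm{CS}} = \{ G_{\mathrm{CS}} \cdot u \mid u \in \mathbb{R}^k \}$
is called a generator matrix for the code $\mathcal{C}_{\mathrm{CS}}$. With
the help of such a matrix, information vectors $u \in \mathbb{R}^k$
are encoded into codewords $x \in \mathbb{R}^n$ according to
$x = G_{\mathrm{CS}} \cdot u$.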

• Let $y \in \mathcal{Y}_{\mathrm{CS}}^n$ be the received vector. We can write
$y = x + e$ for a suitably defined vector $e \in \mathbb{R}^n$, which will
be called the error vector. We initially assume that the
channel is such that $e$ is sparse, i.e., that the number
of non-zero entries is bounded by some positive integer
$k$. This will be generalized later to channels where the
vector $e$ is approximately sparse, i.e., where the number
of large entries is bounded by some positive integer $k$.

• The receiver first computes the syndrome vector $s$ according
to $s = H_{\mathrm{CS}} \cdot y$. Note that

$$ s = H_{\mathrm{CS}} \cdot (x + e) = H_{\mathrm{CS}} \cdot x + H_{\mathrm{CS}} \cdot e = H_{\mathrm{CS}} \cdot e. $$

In a second step, the receiver solves CS-OPT to obtain
an estimate $\hat{e}$ for $e$, which can be used to obtain the
codeword estimate $\hat{x} = y - \hat{e}$, which in turn can be used
to obtain the information word estimate $\hat{u}$.
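The encoding and syndrome steps listed above can be traced on a toy instance. The matrices `H` and `G` below are hypothetical examples chosen for this sketch (they do not come from the paper), and `mat_vec` is a helper name introduced here.

```python
def mat_vec(M, v):
    """Plain matrix-vector product over the reals (here: integers)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


# Hypothetical 2 x 4 measurement (annihilator) matrix.
H = [[1, 1, 0, 1],
     [0, 1, 1, 1]]

# Hypothetical 4 x 2 generator matrix whose columns span the real nullspace of H.
G = [[-1, -1],
     [1, 0],
     [-1, -1],
     [0, 1]]

u = [2, 5]                           # information vector
x = mat_vec(G, u)                    # codeword, so H x = 0
e = [0, 0, 3, 0]                     # 1-sparse error vector
y = [x_i + e_i for x_i, e_i in zip(x, e)]  # received vector

s = mat_vec(H, y)                    # syndrome computed by the receiver
```

Because $H_{\mathrm{CS}} \cdot x = 0$ for every codeword, the syndrome computed from $y$ depends only on the error vector, which is what turns the decoding problem into the sparse recovery problem CS-OPT.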

Because the complexity of solving CS-OPT is usually
exponential in the relevant parameters, one can try to formulate
and solve a related optimization problem whose solution
very often coincides with, or at least closely approximates,
the solution given by CS-OPT. In the
context of CS-OPT, a popular approach is to formulate and
solve the following related optimization problem (which, with
the suitable introduction of auxiliary variables, can be turned
into a linear program):

CS-LPD : minimize $\|e'\|_1$
         subject to $H_{\mathrm{CS}} \cdot e' = s$.

This relaxation is also known as basis pursuit.
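One standard way to introduce the auxiliary variables mentioned above is sketched here. The variable name $t$ is an assumption of this sketch and is not taken from the paper. Each $t_i$ bounds $|e'_i|$ from above, and at an optimum $t_i = |e'_i|$, so the optimal value equals the minimal $\ell_1$ norm.

```latex
\begin{align*}
\text{minimize}\quad   & \sum_{i=1}^{n} t_i \\
\text{subject to}\quad & H_{\mathrm{CS}} \cdot e' = s, \\
                       & -t_i \le e'_i \le t_i, \quad i = 1, \ldots, n.
\end{align*}
```

Both the objective and all constraints are linear in the variables $(e', t)$, so this is a linear program with $2n$ variables.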

B. Conditions for the Equivalence of CS-LPD and CS-OPT 

A central question of compressed sensing theory is under
what conditions the solution given by CS-LPD equals (or is
very close to) the solution given by CS-OPT.^1

Clearly, if $m \ge n$ and the matrix $H_{\mathrm{CS}}$ has rank $n$, there
is only one feasible $e'$ and the two problems have the same
solution.

In this paper we typically focus on the linear sparsity
regime, i.e., $k = \Theta(n)$ and $m = \Theta(n)$, but our techniques
are more generally applicable. The question is for which
measurement matrices (hopefully with a small number of
measurements $m$) the LP relaxation is tight, i.e., the estimate
given by CS-LPD equals the estimate given by CS-OPT.

Celebrated compressed sensing results (e.g., [4], [15]) established
that "good" measurement matrices exist. Here, by
"good" measurement matrices we mean measurement matrices
that have only $m = \Theta(k \log(n/k))$ rows and can recover
all (or almost all) $k$-sparse signals under CS-LPD. Note that
for the linear sparsity regime, $k = \Theta(n)$, the optimal scaling
requires constructing matrices with a number of measurements
that scales linearly in the signal dimension $n$.

One sufficient way to certify that a given measurement matrix
is "good" is the well-known restricted isometry property
(RIP), indicating that the matrix does not distort the $\ell_2$-norm
of any $k$-sparse vector by too much. If this is the case, the LP
relaxation will be tight for all $k$-sparse vectors $e$ and, further,
the recovery will be robust to approximate sparsity [3], [4],
[15]. As is well known, however, the RIP is not a complete
characterization of the LP relaxation of "good" measurement
matrices (see, e.g., [16]). In this paper we use the nullspace
characterization instead (see, e.g., [17], [18]), which gives a
necessary and sufficient condition for a matrix to be "good."

^1 It is important to note that we worry only about the solution given by
CS-LPD being equal (or very close) to the solution given by CS-OPT, because
even CS-OPT might fail to correctly estimate the error vector in the above
communication setup when the error vector has too many large components.

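Definition 1: Let $S \subseteq \mathcal{I}(H_{\mathrm{CS}})$ and let $C \in \mathbb{R}_{\ge 0}$. We say
that $H_{\mathrm{CS}}$ has the nullspace property $\mathrm{NSP}^{\le}_{\mathbb{R}}(S, C)$, and write
$H_{\mathrm{CS}} \in \mathrm{NSP}^{\le}_{\mathbb{R}}(S, C)$, if

$$ C \cdot \|\nu_S\|_1 \le \|\nu_{\overline{S}}\|_1, \quad \text{for all } \nu \in \operatorname{Nullsp}_{\mathbb{R}}(H_{\mathrm{CS}}). $$

We say that $H_{\mathrm{CS}}$ has the strict nullspace property
$\mathrm{NSP}^{<}_{\mathbb{R}}(S, C)$, and write $H_{\mathrm{CS}} \in \mathrm{NSP}^{<}_{\mathbb{R}}(S, C)$, if

$$ C \cdot \|\nu_S\|_1 < \|\nu_{\overline{S}}\|_1, \quad \text{for all } \nu \in \operatorname{Nullsp}_{\mathbb{R}}(H_{\mathrm{CS}}) \setminus \{0\}. $$

□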

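Definition 2: Let $k \in \mathbb{Z}_{\ge 0}$ and let $C \in \mathbb{R}_{\ge 0}$. We say
that $H_{\mathrm{CS}}$ has the nullspace property $\mathrm{NSP}^{\le}_{\mathbb{R}}(k, C)$, and write
$H_{\mathrm{CS}} \in \mathrm{NSP}^{\le}_{\mathbb{R}}(k, C)$, if

$$ H_{\mathrm{CS}} \in \mathrm{NSP}^{\le}_{\mathbb{R}}(S, C), \quad \text{for all } S \subseteq \mathcal{I}(H_{\mathrm{CS}}) \text{ with } |S| \le k. $$

We say that $H_{\mathrm{CS}}$ has the strict nullspace property
$\mathrm{NSP}^{<}_{\mathbb{R}}(k, C)$, and write $H_{\mathrm{CS}} \in \mathrm{NSP}^{<}_{\mathbb{R}}(k, C)$, if

$$ H_{\mathrm{CS}} \in \mathrm{NSP}^{<}_{\mathbb{R}}(S, C), \quad \text{for all } S \subseteq \mathcal{I}(H_{\mathrm{CS}}) \text{ with } |S| \le k. $$

□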

Note that in the above two definitions, C is usually chosen 
to be greater than or equal to 1. 

As was shown independently by several authors (see [18]-[21]
and references therein), the nullspace condition in Definition 2
is a necessary and sufficient condition for a measurement
matrix to be "good" for $k$-sparse signals, i.e., that
the estimate given by CS-LPD equals the estimate given
by CS-OPT for these matrices. In particular, the nullspace
characterization of "good" measurement matrices will be one
of the keys to linking CS-LPD with CC-LPD. Observe that
the requirement is that vectors in the nullspace of $H_{\mathrm{CS}}$ have
their $\ell_1$ mass spread in substantially more than $k$ coordinates.
(In fact, for $C \ge 1$, at least $2k$ coordinates must be non-zero.)
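The inequality in Definitions 1 and 2 can be tested for a single given nullspace vector with a few lines of code. This is only a sketch: the function name `satisfies_strict_nsp` is introduced here, and certifying the property for a matrix would require the check for every non-zero vector in its nullspace, which the sketch does not attempt.

```python
def satisfies_strict_nsp(nu, k, C=1.0):
    """Check C * ||nu_S||_1 < ||nu_{S-bar}||_1 for every S with |S| <= k.

    For a fixed vector nu and C >= 0 the worst case is the set S that holds
    the k largest magnitudes, so testing that single set is enough.
    """
    mags = sorted((abs(x) for x in nu), reverse=True)
    head = sum(mags[:k])   # l1 mass on the k largest coordinates
    tail = sum(mags[k:])   # l1 mass on the remaining coordinates
    return C * head < tail


# Mass spread evenly over four coordinates: fine for k = 1, not for k = 2.
spread = [1, -1, 1, -1]
# Mass concentrated on one coordinate: already fails for k = 1.
peaked = [3, -1, -1, -1]
```

The second example shows why concentrated nullspace vectors are problematic: a single coordinate carries as much $\ell_1$ mass as all the others together, so the strict inequality fails for $k = 1$ and $C = 1$.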

The following theorem is adapted from [21, Proposition 2].

Theorem 3: Let $H_{\mathrm{CS}}$ be a measurement matrix. Further,
assume that $s = H_{\mathrm{CS}} \cdot e$ and that $e$ has at most $k$ nonzero
elements, i.e., $\|e\|_0 \le k$. Then the estimate $\hat{e}$ produced by
CS-LPD will equal the estimate $\hat{e}$ produced by CS-OPT if
$H_{\mathrm{CS}} \in \mathrm{NSP}^{<}_{\mathbb{R}}(k, C = 1)$.

Remark: Actually, as discussed in [21], the condition
$H_{\mathrm{CS}} \in \mathrm{NSP}^{<}_{\mathbb{R}}(k, C = 1)$ is also necessary, but we will not
use this here.

The next performance metric (see, e.g., [17], [22]) for CS
involves recovering approximations to signals that are not
exactly $k$-sparse.

Definition 4: An $\ell_p/\ell_q$ approximation guarantee for CS-LPD
means that CS-LPD outputs an estimate $\hat{e}$ that is within
a factor $C_{p,q}(k)$ from the best $k$-sparse approximation for $e$,
i.e.,

$$ \|e - \hat{e}\|_p \le C_{p,q}(k) \cdot \min_{e' \in \Sigma^{(k)}_{\mathbb{R}^n}} \|e - e'\|_q, \qquad (1) $$

where the left-hand side is measured in the $\ell_p$-norm and the
right-hand side is measured in the $\ell_q$-norm. □
Note that the minimizer of the right-hand side of (1) (for
any norm) is the vector $e' \in \Sigma^{(k)}_{\mathbb{R}^n}$ that has the $k$ largest
(in magnitude) coordinates of $e$, also called the best $k$-term
approximation of $e$ [22]. Therefore the right-hand side of (1)
equals $C_{p,q}(k) \cdot \|e_{\overline{S^*}}\|_q$, where $S^*$ is the support set of the
$k$ largest (in magnitude) components of $e$. Also note that if
$e$ is $k$-sparse then the above condition implies that $\hat{e} = e$
since the right-hand side of (1) vanishes; therefore it is a
strictly stronger statement than recovery of sparse signals.
(Of course, such a stronger approximation guarantee for $\hat{e}$
is usually only obtained under stronger assumptions on the
measurement matrix.)
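The right-hand side of (1) is easy to evaluate numerically. The sketch below uses the hypothetical helper names `best_k_term` and `lq_norm`; it keeps the $k$ largest-magnitude entries of a vector and measures what is left over.

```python
def best_k_term(e, k):
    """Keep the k largest-magnitude entries of e and set the rest to zero."""
    order = sorted(range(len(e)), key=lambda i: abs(e[i]), reverse=True)
    keep = set(order[:k])
    return [e[i] if i in keep else 0 for i in range(len(e))]


def lq_norm(v, q):
    """l_q norm for q >= 1, with q = float('inf') giving the max norm."""
    if q == float("inf"):
        return max(abs(x) for x in v)
    return sum(abs(x) ** q for x in v) ** (1.0 / q)


# An approximately 2-sparse vector: two large entries plus small ones.
e = [5, -0.1, 0, 3, 0.2]
e_k = best_k_term(e, 2)
residual = [a - b for a, b in zip(e, e_k)]
tail_l1 = lq_norm(residual, 1)  # the min in (1) for q = 1
```

For an exactly $k$-sparse vector the residual is the all-zero vector, which is the case in which the guarantee (1) forces exact recovery.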

The nullspace condition is a necessary and sufficient condition
on a measurement matrix to obtain $\ell_1/\ell_1$ approximation
guarantees. This is stated and proven in the next theorem,
which is adapted from [17, Theorem 1]. (Actually, we omit the
necessity part in the next theorem since it will not be needed
in this paper.)

Theorem 5: Let $H_{\mathrm{CS}}$ be a measurement matrix, and let
$C > 1$ be a real constant. Further, assume that $s = H_{\mathrm{CS}} \cdot e$.
Then for any set $S \subseteq \mathcal{I}(H_{\mathrm{CS}})$ with $|S| \le k$ the solution $\hat{e}$
produced by CS-LPD will satisfy

$$ \|e - \hat{e}\|_1 \le 2 \cdot \frac{C + 1}{C - 1} \cdot \|e_{\overline{S}}\|_1 $$

if $H_{\mathrm{CS}} \in \mathrm{NSP}^{\le}_{\mathbb{R}}(k, C)$.

Proof: See Appendix A. ■

IV. Channel Coding 
Linear Programming Decoding 

A. The Setup 

We consider coded data transmission over a memoryless
channel with input alphabet $\mathcal{X}_{\mathrm{CC}} = \{0, 1\}$, output alphabet
$\mathcal{Y}_{\mathrm{CC}}$, and channel law $P_{Y|X}(y|x)$. The coding scheme will
be based on a binary linear code $\mathcal{C}_{\mathrm{CC}}$ of block length $n$ and
dimension $k$, $k \le n$. In the following, we will identify $\mathcal{X}_{\mathrm{CC}}$
with $\mathbb{F}_2$.

• Let $G_{\mathrm{CC}} \in \mathbb{F}_2^{n \times k}$ be a generator matrix for $\mathcal{C}_{\mathrm{CC}}$.
Consequently, $G_{\mathrm{CC}}$ has rank $k$ over $\mathbb{F}_2$, and information vectors
$u \in \mathbb{F}_2^k$ are encoded into codewords $x \in \mathbb{F}_2^n$ according to
$x = G_{\mathrm{CC}} \cdot u \pmod 2$, i.e., $\mathcal{C}_{\mathrm{CC}} = \{ G_{\mathrm{CC}} \cdot u \pmod 2 \mid u \in \mathbb{F}_2^k \}$.^2

• Let $H_{\mathrm{CC}} \in \mathbb{F}_2^{m \times n}$ be a parity-check matrix for $\mathcal{C}_{\mathrm{CC}}$.
Consequently, $H_{\mathrm{CC}}$ has rank $n - k \le m$ over $\mathbb{F}_2$, and
any $x \in \mathbb{F}_2^n$ satisfies $H_{\mathrm{CC}} \cdot x = 0 \pmod 2$ if and only if
$x \in \mathcal{C}_{\mathrm{CC}}$, i.e., $\mathcal{C}_{\mathrm{CC}} = \{ x \in \mathbb{F}_2^n \mid H_{\mathrm{CC}} \cdot x = 0 \pmod 2 \}$.

• In the following we will mainly consider the following
three channels (see, for example, [23]): the binary-input
additive white Gaussian noise channel (AWGNC,
parameterized by its signal-to-noise ratio), the binary
symmetric channel (BSC, parameterized by its cross-over
probability), and the binary erasure channel (BEC,
parameterized by its erasure probability).

• Let $y \in \mathcal{Y}_{\mathrm{CC}}^n$ be the received vector and define for each
$i \in \mathcal{I}(H_{\mathrm{CC}})$ the log-likelihood ratio
$\lambda_i = \lambda_i(y_i) = \log \frac{P_{Y|X}(y_i|0)}{P_{Y|X}(y_i|1)}$.^3

^2 We remind the reader that throughout this paper we are using column
vectors, which is in contrast to the coding theory standard to use row vectors.

Upon observing $Y = y$, the (blockwise) maximum-likelihood
decoding (MLD) rule decides for

$$ \hat{x}(y) = \arg\max_{x' \in \mathcal{C}_{\mathrm{CC}}} P_{Y|X}(y|x'), $$

where $P_{Y|X}(y|x') = \prod_{i \in \mathcal{I}} P_{Y|X}(y_i|x'_i)$. Formally:

CC-MLD : maximize $P_{Y|X}(y|x')$
         subject to $x' \in \mathcal{C}_{\mathrm{CC}}$.



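It is clear that instead of $P_{Y|X}(y|x')$ we can also maximize
$\log P_{Y|X}(y|x') = \sum_{i \in \mathcal{I}} \log P_{Y|X}(y_i|x'_i)$. Noting that
$\log P_{Y|X}(y_i|x'_i) = -\lambda_i x'_i + \log P_{Y|X}(y_i|0)$ for $x'_i \in \{0, 1\}$,
CC-MLD can then be rewritten to read

CC-MLD1 : minimize $\langle \lambda, x' \rangle$
          subject to $x' \in \mathcal{C}_{\mathrm{CC}}$.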



Because the cost function is linear, and a linear function attains 
its minimum at the extremal points of a convex set, this is 
essentially equivalent to 



CC-MLD2 : minimize $\langle \lambda, x' \rangle$
          subject to $x' \in \operatorname{conv}(\mathcal{C}_{\mathrm{CC}})$.

(Here, $\operatorname{conv}(\mathcal{C}_{\mathrm{CC}})$ denotes the convex hull of $\mathcal{C}_{\mathrm{CC}}$ after it has
been embedded in $\mathbb{R}^n$. Note that we wrote "essentially equivalent"
because if more than one codeword in $\mathcal{C}_{\mathrm{CC}}$ is optimal
for CC-MLD1 then all points in the convex hull of these
codewords are optimal for CC-MLD2.) Although CC-MLD2
is a linear program, it usually cannot be solved efficiently
because its description complexity is typically exponential in
the block length of the code.^4
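For very short codes, CC-MLD1 can be solved by exhaustive search, which makes the linear cost function $\langle \lambda, x' \rangle$ concrete. The following is a brute-force sketch for a BSC. The helper names and the toy parity-check matrix (a length-3 repetition code) are assumptions of this example, not taken from the paper, and the enumeration is exponential in the block length.

```python
import itertools
import math


def bsc_llrs(y, p):
    """Log-likelihood ratios log(P(y_i|0) / P(y_i|1)) for a BSC with cross-over probability p."""
    L = math.log((1 - p) / p)
    return [L if y_i == 0 else -L for y_i in y]


def codewords(H):
    """Enumerate all binary vectors x with H x = 0 (mod 2)."""
    n = len(H[0])
    for x in itertools.product((0, 1), repeat=n):
        if all(sum(h * x_i for h, x_i in zip(row, x)) % 2 == 0 for row in H):
            yield x


def ml_decode(H, llr):
    """Solve CC-MLD1 by exhaustive search: minimize <llr, x> over all codewords."""
    return min(codewords(H), key=lambda x: sum(l * x_i for l, x_i in zip(llr, x)))


# Hypothetical toy code: the length-3 repetition code {000, 111}.
H = [[1, 1, 0],
     [0, 1, 1]]
x_hat = ml_decode(H, bsc_llrs([1, 0, 1], 0.1))
```

With cross-over probability below 1/2, a received 1 makes the corresponding $\lambda_i$ negative, so codewords that agree with the received vector in many positions have lower cost. For the repetition code this reduces to a majority vote.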

However, one might try to solve a relaxation of CC-MLD2.
Namely, as proposed by Feldman, Wainwright, and Karger [5],
[6], we can try to solve the optimization problem

CC-LPD : minimize $\langle \lambda, x' \rangle$
         subject to $x' \in \mathcal{P}(H_{\mathrm{CC}})$,

where the relaxed set $\mathcal{P}(H_{\mathrm{CC}}) \supseteq \operatorname{conv}(\mathcal{C}_{\mathrm{CC}})$ is given in the
next definition.

^3 On the side, let us remark that if $\mathcal{Y}_{\mathrm{CC}}$ is binary then $\mathcal{Y}_{\mathrm{CC}}$ can be identified
with $\mathbb{F}_2$ and we can write $y = x + e \pmod 2$ for a suitably defined vector
$e \in \mathbb{F}_2^n$, which will be called the error vector. Moreover, we can define the
syndrome vector $s = H_{\mathrm{CC}} \cdot y \pmod 2$. Note that

$$ s = H_{\mathrm{CC}} \cdot (x + e) = H_{\mathrm{CC}} \cdot x + H_{\mathrm{CC}} \cdot e = H_{\mathrm{CC}} \cdot e \pmod 2. $$

However, in the following, with the exception of Section VII, we will only
use the log-likelihood ratio vector $\lambda$, and not the binary syndrome vector $s$.
(See Definition 20 for a way to define a syndrome vector also for non-binary
channel output alphabets $\mathcal{Y}_{\mathrm{CC}}$.)

^4 Examples of code families that have sub-exponential description complexities
in the block length are convolutional codes (with fixed state-space size),
cycle codes (i.e., codes whose Tanner graph has only degree-2 vertices), and
tree codes (i.e., codes whose Tanner graph is a tree). (For more on this topic,
see for example [24].) However, these classes of codes are not good enough
for achieving performance close to channel capacity even under ML decoding
(see, for example, [25]).

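Definition 6: For every $j \in \mathcal{J}(H_{\mathrm{CC}})$, let $h_j$ be the $j$-th
row of $H_{\mathrm{CC}}$ and let

$$ \mathcal{C}_{\mathrm{CC},j} = \{ x \in \mathbb{F}_2^n \mid \langle h_j, x \rangle = 0 \pmod 2 \}. $$

Then, the fundamental polytope $\mathcal{P} = \mathcal{P}(H_{\mathrm{CC}})$ of $H_{\mathrm{CC}}$ is
defined to be the set

$$ \mathcal{P} = \mathcal{P}(H_{\mathrm{CC}}) = \bigcap_{j \in \mathcal{J}(H_{\mathrm{CC}})} \operatorname{conv}(\mathcal{C}_{\mathrm{CC},j}). $$

Vectors in $\mathcal{P}(H_{\mathrm{CC}})$ will be called pseudo-codewords. □

In order to motivate this choice of relaxation, note that the
code $\mathcal{C}_{\mathrm{CC}}$ can be written as

$$ \mathcal{C}_{\mathrm{CC}} = \mathcal{C}_{\mathrm{CC},1} \cap \cdots \cap \mathcal{C}_{\mathrm{CC},m}, $$

and so

$$ \operatorname{conv}(\mathcal{C}_{\mathrm{CC}}) = \operatorname{conv}(\mathcal{C}_{\mathrm{CC},1} \cap \cdots \cap \mathcal{C}_{\mathrm{CC},m}) \subseteq \operatorname{conv}(\mathcal{C}_{\mathrm{CC},1}) \cap \cdots \cap \operatorname{conv}(\mathcal{C}_{\mathrm{CC},m}) = \mathcal{P}(H_{\mathrm{CC}}). $$

It can be verified [5], [6] that this relaxation possesses the
important property that all the vertices of $\operatorname{conv}(\mathcal{C}_{\mathrm{CC}})$ are also
vertices of $\mathcal{P}(H_{\mathrm{CC}})$. Let us emphasize that different
parity-check matrices for the same code usually lead to different
fundamental polytopes and therefore to different CC-LPDs.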

Similarly to the compressed sensing setup, we want to
understand when we can guarantee that the codeword estimate
given by CC-LPD equals the codeword estimate given by
CC-MLD.^5 Clearly, the performance of CC-MLD is a natural
upper bound on the performance of CC-LPD, and a way to
assess CC-LPD is to study the gap to CC-MLD, e.g., by
comparing the here-discussed performance guarantees for CC-LPD
with known performance guarantees for CC-MLD.

When characterizing the CC-LPD performance of binary
linear codes over binary-input output-symmetric memoryless
channels we can, without loss of generality, assume that the
all-zero codeword was transmitted [5], [6]. With this, the
success probability of CC-LPD is the probability that the
all-zero codeword yields the lowest cost function value when
compared to all non-zero vectors in the fundamental polytope.
Because the cost function is linear, this is equivalent to the
statement that the success probability of CC-LPD equals the
probability that the all-zero codeword yields the lowest cost
function value compared to all non-zero vectors in the conic
hull of the fundamental polytope. This conic hull is called the
fundamental cone $\mathcal{K} = \mathcal{K}(H_{\mathrm{CC}})$ and it can be written as

$$ \mathcal{K} = \mathcal{K}(H_{\mathrm{CC}}) = \operatorname{conic}(\mathcal{P}(H_{\mathrm{CC}})) = \bigcap_{j \in \mathcal{J}(H_{\mathrm{CC}})} \operatorname{conic}(\mathcal{C}_{\mathrm{CC},j}). $$

^5 It is important to note, as we did in the compressed sensing setup, that
we worry mostly about the solution given by CC-LPD being equal to the
solution given by CC-MLD, because even CC-MLD might fail to correctly
identify the codeword that was sent when the error vector is beyond the error
correction capability of the code.

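The fundamental cone can be characterized by the inequalities
listed in the following lemma [5]-[8], [26]. (Similar inequalities
can be given for the fundamental polytope but we will
not list them here since they are not needed in this paper.)

Lemma 7: The fundamental cone $\mathcal{K} = \mathcal{K}(H_{\mathrm{CC}})$ of $H_{\mathrm{CC}}$
is the set of all vectors $\omega \in \mathbb{R}^n$ that satisfy

$$ \omega_i \ge 0, \quad \text{for all } i \in \mathcal{I}, \qquad (2) $$

$$ \omega_i \le \sum_{i' \in \mathcal{I}_j \setminus \{i\}} \omega_{i'}, \quad \text{for all } j \in \mathcal{J} \text{ and all } i \in \mathcal{I}_j. \qquad (3) $$

□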
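Inequalities (2) and (3) translate directly into a membership test for the fundamental cone. The sketch below is illustrative only: the function name `in_fundamental_cone`, the tolerance parameter, and the toy parity-check matrix are assumptions of this example.

```python
def in_fundamental_cone(H, w, tol=1e-12):
    """Return True if w satisfies inequalities (2) and (3) for the matrix H."""
    # (2): every component must be non-negative.
    if any(w_i < -tol for w_i in w):
        return False
    for row in H:
        # Positions taking part in this check (the set I_j).
        support = [i for i, h in enumerate(row) if h != 0]
        total = sum(w[i] for i in support)
        for i in support:
            # (3): w_i is at most the sum over the other positions in the check.
            if w[i] > (total - w[i]) + tol:
                return False
    return True


# Hypothetical toy parity-check matrix with two checks on four positions.
H = [[1, 1, 1, 0],
     [0, 1, 1, 1]]

codeword = [1, 1, 0, 1]        # satisfies both parity checks
fractional = [1, 0.5, 0.5, 0]  # not a codeword, yet inside the cone
dominated = [2, 1, 0, 0]       # first entry exceeds the rest of check 1
```

The second vector illustrates that the cone contains points that are not scaled codewords, which is the source of the gap between CC-LPD and CC-MLD.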

Note that in the following, not only vectors in the fundamental
polytope, but also vectors in the fundamental cone
will be called pseudo-codewords. Moreover, if $H_{\mathrm{CS}}$ is a zero-one
measurement matrix, i.e., a measurement matrix where all
entries are in $\{0, 1\}$, then we will consider $H_{\mathrm{CS}}$ to represent
also the parity-check matrix of some linear code over $\mathbb{F}_2$.
Consequently, its fundamental polytope will be denoted by
$\mathcal{P}(H_{\mathrm{CS}})$ and its fundamental cone by $\mathcal{K}(H_{\mathrm{CS}})$.

B. Conditions for the Equivalence of CC-LPD and CC-MLD 

The following lemma gives a sufficient condition on $H_{\mathrm{CC}}$
for CC-LPD to succeed over a BSC.

Lemma 8: Let $H_{\mathrm{CC}}$ be a parity-check matrix of a code $\mathcal{C}_{\mathrm{CC}}$
and let $S \subseteq \mathcal{I}(H_{\mathrm{CC}})$ be the set of coordinate indices that are
flipped by a BSC with non-zero cross-over probability. If $H_{\mathrm{CC}}$
is such that

$$ \|\omega_S\|_1 < \|\omega_{\overline{S}}\|_1 \qquad (4) $$

for all $\omega \in \mathcal{K}(H_{\mathrm{CC}}) \setminus \{0\}$, then the CC-LPD decision equals
the codeword that was sent.

Remark: The above condition is also necessary; however,
we will not use this fact in the following.

Proof: See Appendix B. ■

Note that the inequality in (4) is identical to the inequality
that appears in the definition of the strict nullspace property
for $C = 1$ (!). This observation makes one wonder if there is
a deeper connection between CS-LPD and CC-LPD beyond
this apparent one, in particular for measurement matrices that
contain only zeros and ones. Of course, in order to formalize
a connection we first need to understand how points in the
nullspace of a zero-one measurement matrix $H_{\mathrm{CS}}$ can be
associated with points in the fundamental polytope of the
parity-check matrix $H_{\mathrm{CS}}$ (now seen as a parity-check matrix
for a code over $\mathbb{F}_2$). Such a mapping will be exhibited in the
upcoming Section V. Before turning to that section, though,
we need to discuss pseudo-weights, which are a popular
way of measuring the importance of the different
pseudo-codewords in the fundamental cone and which will be used
for establishing performance guarantees for CC-LPD.



C. Definition of Pseudo-Weights 

Note that the fundamental polytope and cone are functions 
only of the parity-check matrix of the code and not of the chan- 
nel. The influence of the channel is reflected in the pseudo- 
weight of the pseudo-codewords, so it is only natural that every 
channel has its own pseudo-weight definition. Therefore, every 
communication channel model comes with the right measure 
of "distance" that determines how often a (fractional) vertex 
is incorrectly chosen in CC-LPD. 

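Definition 9 ([5]-[8], [27], [28]): Let $\omega$ be a nonzero
vector in $\mathbb{R}^n_{\ge 0}$ with $\omega = (\omega_1, \ldots, \omega_n)$.

• The AWGNC pseudo-weight of $\omega$ is defined to be

$$ w_{\mathrm{p}}^{\mathrm{AWGNC}}(\omega) = \frac{\|\omega\|_1^2}{\|\omega\|_2^2}. $$

• In order to define the BSC pseudo-weight $w_{\mathrm{p}}^{\mathrm{BSC}}(\omega)$, we
let $\omega'$ be the vector with the same components as $\omega$ but
in non-increasing order, i.e., $\omega'$ is a "sorted version" of
$\omega$. Now let

$$ f(\xi) = \omega'_i \ \ (i - 1 < \xi \le i,\ 0 < \xi \le n), \qquad F(\xi) = \int_0^{\xi} f(\xi')\, d\xi', \qquad e = F^{-1}\!\left(\frac{F(n)}{2}\right) = F^{-1}\!\left(\frac{\|\omega\|_1}{2}\right). $$

With this, the BSC pseudo-weight $w_{\mathrm{p}}^{\mathrm{BSC}}(\omega)$ of $\omega$ is
defined to be $w_{\mathrm{p}}^{\mathrm{BSC}}(\omega) = 2e$.

• The BEC pseudo-weight of $\omega$ is defined to be

$$ w_{\mathrm{p}}^{\mathrm{BEC}}(\omega) = |\operatorname{supp}(\omega)|. $$

• The max-fractional weight of $\omega$ is defined to be

$$ w_{\text{max-frac}}(\omega) = \frac{\|\omega\|_1}{\|\omega\|_\infty}. $$

For $\omega = 0$ we define all of the above pseudo-weights and the
max-fractional weight to be zero.^6 □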
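The four quantities of Definition 9 can be computed with a few lines of code. This is a sketch under stated assumptions: the function names are introduced here, the input is assumed to have non-negative entries, and the BSC pseudo-weight is computed by locating the point where the running sum of the sorted entries, interpolated linearly inside an entry, reaches half of the total.

```python
def awgnc_pw(w):
    """AWGNC pseudo-weight: squared l1 norm divided by squared l2 norm."""
    if not any(w):
        return 0.0
    l1 = sum(abs(x) for x in w)
    return l1 * l1 / sum(x * x for x in w)


def bsc_pw(w):
    """BSC pseudo-weight 2e, where the sorted mass up to position e is half the total."""
    if not any(w):
        return 0.0
    s = sorted(w, reverse=True)   # the "sorted version" of w
    half = sum(s) / 2.0
    acc = 0.0
    for i, v in enumerate(s):
        if acc + v >= half:
            # The cumulative sum grows linearly with slope v between i and i + 1.
            return 2.0 * (i + (half - acc) / v)
        acc += v


def bec_pw(w):
    """BEC pseudo-weight: size of the support."""
    return sum(1 for x in w if x != 0)


def max_frac_w(w):
    """Max-fractional weight: l1 norm divided by the largest entry."""
    if not any(w):
        return 0.0
    return sum(abs(x) for x in w) / max(abs(x) for x in w)


codeword_like = [1, 1, 1, 0]   # a zero-one vector of Hamming weight 3
skewed = [2, 1, 1]             # one entry carries half of the l1 mass
```

For a zero-one vector all four quantities coincide with the Hamming weight, while for the skewed vector they differ (for instance, its BSC pseudo-weight is 2 but its BEC pseudo-weight is 3).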
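For a parity-check matrix $H_{\mathrm{CC}}$, the minimum AWGNC
pseudo-weight is defined to be

$$ w_{\mathrm{p}}^{\mathrm{AWGNC,min}}(H_{\mathrm{CC}}) = \min_{\omega \in \mathcal{K}(H_{\mathrm{CC}}) \setminus \{0\}} w_{\mathrm{p}}^{\mathrm{AWGNC}}(\omega). $$

The minimum BSC pseudo-weight $w_{\mathrm{p}}^{\mathrm{BSC,min}}(H_{\mathrm{CC}})$, the minimum
BEC pseudo-weight $w_{\mathrm{p}}^{\mathrm{BEC,min}}(H_{\mathrm{CC}})$, and the minimum
max-fractional weight $w_{\text{max-frac}}^{\min}(H_{\mathrm{CC}})$ of $H_{\mathrm{CC}}$ are defined
analogously. Note that although $w_{\text{max-frac}}^{\min}(H_{\mathrm{CC}})$ yields
weaker performance guarantees than the other quantities [8],
it has the advantage of being efficiently computable [5], [6].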

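There are other possible definitions of a BSC pseudo-weight.
For example, the BSC pseudo-weight of $\omega$ can also
be taken to be

$$ w_{\mathrm{p}}^{\mathrm{BSC}'}(\omega) = \begin{cases} 2e & \text{if } \|\omega'_{\{1,\ldots,e\}}\|_1 = \|\omega'_{\{e+1,\ldots,n\}}\|_1, \\ 2e - 1 & \text{if } \|\omega'_{\{1,\ldots,e\}}\|_1 > \|\omega'_{\{e+1,\ldots,n\}}\|_1, \end{cases} $$

where $\omega'$ is defined as in Definition 9 and where $e$ is the
smallest integer such that $\|\omega'_{\{1,\ldots,e\}}\|_1 \ge \|\omega'_{\{e+1,\ldots,n\}}\|_1$.
This definition of the BSC pseudo-weight was, for example,
used in [29]. (Note that in [28] the quantity $w_{\mathrm{p}}^{\mathrm{BSC}'}(\omega)$ was
introduced as "BSC effective weight.")

Of course, the values $w_{\mathrm{p}}^{\mathrm{BSC}}(\omega)$ and $w_{\mathrm{p}}^{\mathrm{BSC}'}(\omega)$ are tightly
connected. Namely, if $w_{\mathrm{p}}^{\mathrm{BSC}'}(\omega)$ is an even integer then
$w_{\mathrm{p}}^{\mathrm{BSC}'}(\omega) = w_{\mathrm{p}}^{\mathrm{BSC}}(\omega)$, and if $w_{\mathrm{p}}^{\mathrm{BSC}'}(\omega)$ is an odd integer
then $w_{\mathrm{p}}^{\mathrm{BSC}'}(\omega) - 1 < w_{\mathrm{p}}^{\mathrm{BSC}}(\omega) < w_{\mathrm{p}}^{\mathrm{BSC}'}(\omega) + 1$.

^6 A detailed discussion of the motivation and significance of these definitions
can be found in [8].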



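The following lemma establishes a connection between BSC pseudo-weights and the condition that appears in Lemma 8.

Lemma 10: Let $H_{\mathrm{CC}}$ be a parity-check matrix of a code $\mathcal{C}_{\mathrm{CC}}$ and let $\omega$ be an arbitrary non-zero pseudo-codeword of $H_{\mathrm{CC}}$, i.e., $\omega \in \mathcal{K}(H_{\mathrm{CC}}) \setminus \{0\}$. Then, for all sets $\mathcal{S} \subseteq \mathcal{I}(H_{\mathrm{CC}})$ with

$$|\mathcal{S}| < \tfrac{1}{2} \cdot w_{\mathrm{p}}^{\mathrm{BSC}}(\omega) \quad \text{or with} \quad |\mathcal{S}| < \tfrac{1}{2} \cdot w_{\mathrm{p}}^{\mathrm{BSC}'}(\omega)$$

it holds that

$$\|\omega_{\mathcal{S}}\|_1 < \|\omega_{\overline{\mathcal{S}}}\|_1.$$

Proof: See Appendix C. ■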
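To make Lemma 10 concrete, the sketch below computes the alternative pseudo-weight $w_{\mathrm{p}}^{\mathrm{BSC}'}$ and checks by brute force, on a small nonnegative integer vector, that every index set $\mathcal{S}$ with $|\mathcal{S}| < \tfrac{1}{2} w_{\mathrm{p}}^{\mathrm{BSC}'}(\omega)$ satisfies $\|\omega_{\mathcal{S}}\|_1 < \|\omega_{\overline{\mathcal{S}}}\|_1$ (the condition used later in the proof of Lemma 12). It is an illustration only, with function names of our choosing, and it does not replace the proof in Appendix C.

```python
from itertools import combinations


def bsc_prime_pseudo_weight(w):
    """2e or 2e - 1, with e the smallest prefix length whose mass reaches the tail mass."""
    s = sorted(w, reverse=True)
    total = sum(s)
    if total == 0:
        return 0
    head = 0
    for e, x in enumerate(s, start=1):
        head += x
        tail = total - head
        if head == tail:
            return 2 * e
        if head > tail:
            return 2 * e - 1


def lemma10_condition_holds(w):
    """Check ||w_S||_1 < ||w_(complement of S)||_1 for all S with |S| < weight / 2."""
    weight = bsc_prime_pseudo_weight(w)
    total = sum(w)
    for size in range(len(w) + 1):
        if 2 * size >= weight:
            break
        for subset in combinations(range(len(w)), size):
            inside = sum(w[i] for i in subset)
            if not inside < total - inside:
                return False
    return True
```

Integer inputs keep the equality test `head == tail` exact; with floating-point inputs a tolerance would be needed.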

V. Establishing a Bridge Between CS-LPD and CC-LPD

We are now ready to establish the promised bridge between CS-LPD and CC-LPD to be used in Section VI to translate performance guarantees from one setup to the other. Our main tool is a simple lemma that was already established in [30], but for a different purpose.

We remind the reader that we have extended the use of the absolute value operator $|\cdot|$ from scalars to vectors. So, if $a = (a_i)_i$ is a real (complex) vector then we define $|a|$ to be the real (complex) vector $a' = (a'_i)_i$ with the same number of components as $a$ and with entries $a'_i = |a_i|$ for all $i$.

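Lemma 11 (Lemma 6 in [30]): Let $H_{\mathrm{CS}}$ be a zero-one measurement matrix. Then

$$\nu \in \mathrm{Nullsp}_{\mathbb{R}}(H_{\mathrm{CS}}) \ \Rightarrow \ |\nu| \in \mathcal{K}(H_{\mathrm{CS}}).$$

Remark: Note that $\mathrm{supp}(\nu) = \mathrm{supp}(|\nu|)$.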

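Proof: Let $\omega \triangleq |\nu|$. In order to show that such a vector $\omega$ is indeed in the fundamental cone of $H_{\mathrm{CS}}$, we need to verify (2) and (3). The way $\omega$ is defined, it is clear that it satisfies (2). Therefore, let us focus on the proof that $\omega$ satisfies (3). Namely, from $\nu \in \mathrm{Nullsp}_{\mathbb{R}}(H_{\mathrm{CS}})$ it follows that for all $j \in \mathcal{J}$, $\sum_{i \in \mathcal{I}} h_{j,i} \nu_i = 0$, i.e., for all $j \in \mathcal{J}$, $\sum_{i \in \mathcal{I}_j} \nu_i = 0$. This implies

$$\omega_i = |\nu_i| = \Big| - \sum_{i' \in \mathcal{I}_j \setminus i} \nu_{i'} \Big| \leq \sum_{i' \in \mathcal{I}_j \setminus i} |\nu_{i'}| = \sum_{i' \in \mathcal{I}_j \setminus i} \omega_{i'}$$

for all $j \in \mathcal{J}$ and all $i \in \mathcal{I}_j$, showing that $\omega$ indeed satisfies (3). ■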
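The argument can be checked numerically on a toy example. The sketch below assumes that the fundamental cone of a zero-one matrix is described by nonnegativity of all entries together with the inequalities $\omega_i \leq \sum_{i' \in \mathcal{I}_j \setminus i} \omega_{i'}$ for every row $j$ and every $i$ in the support $\mathcal{I}_j$ of that row, which is how conditions (2) and (3) are used in the proof. The helper names are hypothetical.

```python
def in_real_nullspace(H, v):
    """True if H * v = 0 over the reals (exact for integer inputs)."""
    return all(sum(h * x for h, x in zip(row, v)) == 0 for row in H)


def in_fundamental_cone(H, w):
    """Nonnegativity plus: no entry in a check exceeds the sum of the other entries in it."""
    if any(x < 0 for x in w):
        return False
    for row in H:
        support = [i for i, h in enumerate(row) if h == 1]
        row_sum = sum(w[i] for i in support)
        for i in support:
            if w[i] > row_sum - w[i]:
                return False
    return True


# Toy zero-one measurement matrix and a vector in its real nullspace.
H = [[1, 1, 1, 0],
     [0, 1, 1, 1]]
nu = [1, -2, 1, 1]
abs_nu = [abs(x) for x in nu]
```

Here `in_real_nullspace(H, nu)` and `in_fundamental_cone(H, abs_nu)` both hold, as Lemma 11 predicts, whereas a vector such as `[3, 1, 1, 0]` violates the cone inequality for the first row.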



This lemma gives a one-way result: with every point in the $\mathbb{R}$-nullspace of the measurement matrix $H_{\mathrm{CS}}$ we can associate a point in the fundamental cone of $H_{\mathrm{CS}}$, but not necessarily vice-versa. Therefore, a problematic point for the $\mathbb{R}$-nullspace of $H_{\mathrm{CS}}$ will translate to a problematic point in the fundamental cone of $H_{\mathrm{CS}}$ and hence to bad performance of CC-LPD. Similarly, a "good" parity-check matrix $H_{\mathrm{CS}}$ must have no low pseudo-weight points in the fundamental cone, which means that there are no problematic points in the $\mathbb{R}$-nullspace of $H_{\mathrm{CS}}$. Therefore, "positive" results for channel coding will translate into "positive" results for compressed sensing, and "negative" results for compressed sensing will translate into "negative" results for channel coding.

Further, Lemma 11 preserves the support of a given point $\nu$. This means that if there are no low pseudo-weight points in the fundamental cone of $H_{\mathrm{CS}}$ with a given support, there are no problematic points in the $\mathbb{R}$-nullspace of $H_{\mathrm{CS}}$ with the same support, which allows point-wise versions of all our results in Section VI.

Note that Lemma 11 assumes that $H_{\mathrm{CS}}$ is a zero-one measurement matrix, i.e., that it contains only zeros and ones. As we show in Appendix D, there are suitable extensions of this lemma that put fewer restrictions on the measurement matrix. However, apart from Remark 19, we will not use these extensions in the following. (We leave it as an exercise to extend the results in the upcoming sections to this more general class of measurement matrices.)

VI. Translation of Performance Guarantees 

In this section we use the above-established bridge between CS-LPD and CC-LPD to translate "positive" results about CC-LPD to "positive" results about CS-LPD. Whereas Sections VI-A to VI-E focus on the translation of abstract performance bounds, Section VI-F presents the translation of numerical performance bounds. Finally, in Section VI-G we briefly discuss some limitations of our approach when dense measurement matrices are considered.

A. The Role of the BSC Pseudo-Weight for CS-LPD 

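Lemma 12: Let $H_{\mathrm{CS}} \in \{0,1\}^{m \times n}$ be a CS measurement matrix and let $k$ be a non-negative integer. Then

$$w_{\mathrm{p}}^{\mathrm{BSC,min}}(H_{\mathrm{CS}}) > 2k \ \Rightarrow \ H_{\mathrm{CS}} \in \mathrm{NSP}^{<}_{\mathbb{R}}(k, C = 1).$$

Proof: Fix some $\nu \in \mathrm{Nullsp}_{\mathbb{R}}(H_{\mathrm{CS}}) \setminus \{0\}$. By Lemma 11 we know that $|\nu|$ is a pseudo-codeword of $H_{\mathrm{CS}}$, and by the assumption $w_{\mathrm{p}}^{\mathrm{BSC,min}}(H_{\mathrm{CS}}) > 2k$ we know that $w_{\mathrm{p}}^{\mathrm{BSC}}(|\nu|) > 2k$. Then, using Lemma 10, we conclude that for all sets $\mathcal{S} \subseteq \mathcal{I}$ with $|\mathcal{S}| \leq k$, we must have $\|\nu_{\mathcal{S}}\|_1 = \| |\nu_{\mathcal{S}}| \|_1 < \| |\nu_{\overline{\mathcal{S}}}| \|_1 = \|\nu_{\overline{\mathcal{S}}}\|_1$. Because $\nu$ was arbitrary, the claim $H_{\mathrm{CS}} \in \mathrm{NSP}^{<}_{\mathbb{R}}(k, C = 1)$ clearly follows. ■

This result, along with Theorem 3, can be used to establish sparse signal recovery guarantees for a compressed sensing matrix $H_{\mathrm{CS}}$.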

Note that compressed sensing theory distinguishes between the so-called strong bounds and the so-called weak bounds. The former bounds correspond to a worst-case setup and guarantee the recovery of all $k$-sparse signals, whereas the latter bounds correspond to an average-case setup and guarantee the recovery of a signal on a randomly selected support with high probability regardless of the values of the non-zero entries. Note that a further notion of a weak bound can be defined if we also randomize over the non-zero entries, but this is not considered in this paper.

Similarly, for channel coding over the BSC, there is a 
distinction between being able to recover from k worst-case 
bit-flipping errors and being able to recover from randomly 
positioned bit-flipping errors. 

In particular, recent results on the performance analysis of CC-LPD have shown that parity-check matrices constructed from expander graphs can correct a constant fraction (of the block length $n$) of worst-case errors (cf. [12]) and random errors (cf. [10], [13]). These worst-case error performance guarantees implicitly show that the minimum BSC pseudo-weight of a binary linear code defined by a Tanner graph with sufficient expansion (expansion strictly larger than 3/4) must grow linearly in $n$. (A conclusion in a similar direction can be drawn for the random error setup.) Now, with the help of Lemma 12, we can obtain new performance guarantees for CS-LPD.

Let us mention that in [11], [31], [32] expansion arguments were used to directly obtain similar types of performance guarantees for compressed sensing; in Section VI-F we compare these results to the guarantees we can obtain through our translation techniques.

In contrast to the present subsection, which deals with the recovery of (exactly) sparse signals, the next three subsections (Sections VI-B, VI-C, and VI-D) deal with the recovery of approximately sparse signals. Note that the type of guarantees presented in these subsections are known as instance optimality guarantees.



B. The Role of Binary-Input Channels Beyond the BSC for CS-LPD

In Lemma 12 we established a connection between, on the one hand, performance guarantees for the BSC under CC-LPD, and, on the other hand, the strict nullspace property $\mathrm{NSP}^{<}_{\mathbb{R}}(k, C)$ for $C = 1$. It is worthwhile to mention that one can also establish a connection between performance guarantees for a certain class of binary-input channels under CC-LPD and the strict nullspace property $\mathrm{NSP}^{<}_{\mathbb{R}}(k, C)$ for $C > 1$. Without going into details, this connection is established with the help of results from [33], which generalize results from [12], and which deal with a class of binary-input memoryless channels where all output symbols are such that the magnitude of the corresponding log-likelihood ratio is bounded by some constant $W \in \mathbb{R}_{>0}$. This observation, along with Theorem 5, can be used to establish instance optimality $\ell_1/\ell_1$ guarantees for a compressed sensing matrix $H_{\mathrm{CS}}$. Let us point out that in some recent follow-up work [34] this has been accomplished.

Footnote: Note that in [33], "This suggests that the asymptotic advantage over [. . .] is gained not by quantization, but rather by restricting the LLRs to have finite support." should read "This suggests that the asymptotic advantage over [. . .] is gained not by quantization, but rather by restricting the LLRs to have bounded support."



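C. Connection between AWGNC Pseudo-Weight and $\ell_2/\ell_1$ Guarantees

Theorem 13: Let $H_{\mathrm{CS}} \in \{0,1\}^{m \times n}$ be a measurement matrix and let $s$ and $e$ be such that $s = H_{\mathrm{CS}} \cdot e$. Let $\mathcal{S} \subseteq \mathcal{I}(H_{\mathrm{CS}})$ with $|\mathcal{S}| = k$, and let $C'$ be an arbitrary positive real number with $C' > 4k$. Then the estimate $\hat{e}$ produced by CS-LPD will satisfy

$$\|e - \hat{e}\|_2 \leq \frac{C''}{\sqrt{k}} \cdot \|e_{\overline{\mathcal{S}}}\|_1 \quad \text{with} \quad C'' \triangleq \frac{1}{\sqrt{\frac{C'}{4k}} - 1}$$

if $w_{\mathrm{p}}^{\mathrm{AWGNC}}(|\nu|) \geq C'$ holds for all $\nu \in \mathrm{Nullsp}_{\mathbb{R}}(H_{\mathrm{CS}}) \setminus \{0\}$. (In particular, this latter condition is satisfied for a measurement matrix $H_{\mathrm{CS}}$ with $w_{\mathrm{p}}^{\mathrm{AWGNC,min}}(H_{\mathrm{CS}}) \geq C'$.)

Proof: See Appendix E. ■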



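D. Connection between Max-Fractional Weight and $\ell_\infty/\ell_1$ Guarantees

Theorem 14: Let $H_{\mathrm{CS}} \in \{0,1\}^{m \times n}$ be a measurement matrix and let $s$ and $e$ be such that $s = H_{\mathrm{CS}} \cdot e$. Let $\mathcal{S} \subseteq \mathcal{I}(H_{\mathrm{CS}})$ with $|\mathcal{S}| = k$, and let $C'$ be an arbitrary positive real number with $C' > 2k$. Then the estimate $\hat{e}$ produced by CS-LPD will satisfy

$$\|e - \hat{e}\|_\infty \leq \frac{C''}{k} \cdot \|e_{\overline{\mathcal{S}}}\|_1 \quad \text{with} \quad C'' \triangleq \frac{1}{\frac{C'}{2k} - 1}$$

if $w_{\mathrm{max\text{-}frac}}(|\nu|) \geq C'$ holds for all $\nu \in \mathrm{Nullsp}_{\mathbb{R}}(H_{\mathrm{CS}}) \setminus \{0\}$. (In particular, this latter condition is satisfied for a measurement matrix $H_{\mathrm{CS}}$ with $w_{\mathrm{max\text{-}frac}}^{\mathrm{min}}(H_{\mathrm{CS}}) \geq C'$.)

Proof: See Appendix F. ■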
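To see why a max-fractional-weight condition leads to an $\ell_\infty/\ell_1$ bound with a constant of the form $1/(C'/(2k) - 1)$, the following outline may help. It is our own sketch under the stated assumptions and not necessarily the argument of Appendix F. Write $\nu \triangleq e - \hat{e}$, which lies in $\mathrm{Nullsp}_{\mathbb{R}}(H_{\mathrm{CS}})$ because both vectors produce the same $s$, and use that the CS-LPD estimate satisfies $\|\hat{e}\|_1 \leq \|e\|_1$. The case $\nu = 0$ is trivial.

```latex
\begin{align*}
\|e\|_1 \ge \|e - \nu\|_1
  &\ge \|e_{\mathcal{S}}\|_1 - \|\nu_{\mathcal{S}}\|_1
     + \|\nu_{\overline{\mathcal{S}}}\|_1 - \|e_{\overline{\mathcal{S}}}\|_1
  && \Longrightarrow \quad
  \|\nu_{\overline{\mathcal{S}}}\|_1 \le \|\nu_{\mathcal{S}}\|_1 + 2\,\|e_{\overline{\mathcal{S}}}\|_1, \\
\|\nu\|_1
  &\le 2\,\|\nu_{\mathcal{S}}\|_1 + 2\,\|e_{\overline{\mathcal{S}}}\|_1
   \le 2k\,\|\nu\|_\infty + 2\,\|e_{\overline{\mathcal{S}}}\|_1
  && \text{(since } |\mathcal{S}| = k \text{)}, \\
C'\,\|\nu\|_\infty
  &\le \|\nu\|_1
  && \text{(since } w_{\mathrm{max\text{-}frac}}(|\nu|) \ge C' \text{)}, \\
\|\nu\|_\infty
  &\le \frac{2}{C' - 2k}\,\|e_{\overline{\mathcal{S}}}\|_1
   = \frac{1}{k} \cdot \frac{1}{\frac{C'}{2k} - 1}\,\|e_{\overline{\mathcal{S}}}\|_1 .
\end{align*}
```

The requirement $C' > 2k$ is exactly what keeps the factor $C' - 2k$ positive in the last step.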



E. Connection between BEC Pseudo-Weight and CS-LPD 

For the binary erasure channel, CC-LPD is identical to the peeling decoder (see, e.g., [23, Chapter 3.19]) that solves a system of linear equations by only using back-substitution.

We can define an analogous compressed sensing problem by assuming that the support of the sparse signal $e$ is known to the decoder, and that the recovery of the values is performed only by back-substitution. This simple procedure is related to iterative algorithms that recover sparse approximations more efficiently than by solving an optimization problem (see, e.g., [35]-[38] and references therein).

For this special case, it is clear that CC-LPD for the BEC 
and the described compressed sensing decoder have identical 
performance since back-substitution behaves exactly the same 
way over any field, be it the field of real numbers or any 
finite field. (Note that whereas the result of CC-LPD for the 
BEC equals the result of the back-substitution-based decoder 
for the BEC, the same is not true for compressed sensing, 
i.e., CS-LPD with given support of the sparse signal can be 
strictly better than the back-substitution-based decoder with 
given support of the sparse signal.)

F. Explicit Performance Results

In this section we use the bridge lemma, Lemma 11, along with previous positive performance results for CC-LPD, to establish performance results for the CS-LPD / basis pursuit setup. In particular, three positive threshold results for CC-LPD of low-density parity-check (LDPC) codes are used to obtain three results that are, to the best of our knowledge, novel for compressed sensing:

• Corollary 16 (which relies on work by Feldman, Malkin, Servedio, Stein, and Wainwright [12]) is very similar to [11], [31], [32], although our proof is obtained through the connection to channel coding. We obtain a strong bound with similar expansion requirements.

• Corollary 17 (which relies on work by Daskalakis, Dimakis, Karp, and Wainwright [10]) is a result that yields better constants (i.e., larger recoverable signals) but only with high probability over supports (i.e., it is a so-called weak bound).

• Corollary 18 (which relies on work by Arora, Daskalakis, and Steurer [13]) is, in our opinion, the most important contribution. We show the first deterministic construction of compressed sensing measurement matrices with an order-optimal number of measurements. Further, we show that a property that is easy to check in polynomial time (i.e., girth) can be used to certify measurement matrices. Further, in the follow-up paper [34] it is shown that similar techniques can be used to construct the first optimal measurement matrices with $\ell_1/\ell_1$ sparse approximation properties.

At the end of the section we also use Lemma 25 (cf. Appendix D) with $|\cdot|_* = |\cdot|$ to study dense measurement matrices with entries in $\{-1, 0, +1\}$.

Before we can state our first translation result, we need to 
introduce some notation. 

Definition 15: Let $G$ be a bipartite graph where the nodes in the two node classes are called left-nodes and right-nodes, respectively. If $\mathcal{S}$ is some subset of left-nodes, we let $\mathcal{N}(\mathcal{S})$ be the subset of the right-nodes that are adjacent to $\mathcal{S}$. Then, given parameters $d_{\mathrm{v}} \in \mathbb{Z}_{>0}$, $\gamma \in (0,1)$, $\delta \in (0,1)$, we say that $G$ is a $(d_{\mathrm{v}}, \gamma, \delta)$-expander if all left-nodes of $G$ have degree $d_{\mathrm{v}}$ and if for all left-node subsets $\mathcal{S}$ with $|\mathcal{S}| \leq \gamma \cdot |\{\text{left-nodes}\}|$ it holds that $|\mathcal{N}(\mathcal{S})| \geq \delta d_{\mathrm{v}} \cdot |\mathcal{S}|$. □
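For very small graphs the expander property of Definition 15 can be verified exhaustively. The sketch below takes the zero-one matrix whose columns are left-nodes and whose rows are right-nodes, and enumerates all left-node subsets up to size $\lfloor \gamma n \rfloor$. The function name is ours, and the running time is exponential in $n$, so this is for illustration only.

```python
from fractions import Fraction
from itertools import combinations


def is_expander(H, d_v, gamma, delta):
    """Brute-force test of the (d_v, gamma, delta)-expander property for a 0/1 matrix H."""
    m, n = len(H), len(H[0])
    # Right-node neighbourhood of every left-node (column).
    nbrs = [{j for j in range(m) if H[j][i]} for i in range(n)]
    if any(len(nb) != d_v for nb in nbrs):
        return False
    max_size = int(Fraction(gamma) * n)  # floor(gamma * n)
    for size in range(1, max_size + 1):
        for subset in combinations(range(n), size):
            neighbourhood = set().union(*(nbrs[i] for i in subset))
            if len(neighbourhood) < Fraction(delta) * d_v * size:
                return False
    return True


# An 8-cycle viewed as a bipartite graph: 4 left-nodes, 4 right-nodes, d_v = 2.
H_cycle = [[1, 0, 0, 1],
           [1, 1, 0, 0],
           [0, 1, 1, 0],
           [0, 0, 1, 1]]
```

For `H_cycle` with `gamma = 0.5`, pairs of adjacent left-nodes share one right-node and so have three neighbours, which meets $\delta d_{\mathrm{v}} |\mathcal{S}| = 3$ for $\delta = 3/4$ but not for any larger $\delta$.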
Expander graphs have been studied extensively in past work on channel coding (see, e.g., [39]) and compressed sensing (see, e.g., [31], [32]). It is well known that randomly constructed left-regular bipartite graphs are expanders with high probability (see, e.g., [12]).

In the following, similar to the way a Tanner graph is associated with a parity-check matrix [40], we will associate a Tanner graph with a measurement matrix. Note that the variable and constraint nodes of a Tanner graph will be called left-nodes and right-nodes, respectively.

With this, we are ready to present the first translation result, which is a so-called strong bound (cf. the discussion in Section VI-A). It is based on a theorem from [12].

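Corollary 16: Let $d_{\mathrm{v}} \in \mathbb{Z}_{>0}$ and $\gamma \in (0,1)$. Let $H_{\mathrm{CS}} \in \{0,1\}^{m \times n}$ be a measurement matrix such that the Tanner graph of $H_{\mathrm{CS}}$ is a $(d_{\mathrm{v}}, \gamma, \delta)$-expander with sufficient expansion, more precisely, with

$$\delta > \frac{2}{3} + \frac{1}{3 d_{\mathrm{v}}}$$

(along with the technical condition $\delta d_{\mathrm{v}} \in \mathbb{Z}_{>0}$). Then CS-LPD based on the measurement matrix $H_{\mathrm{CS}}$ can recover all $k$-sparse vectors, i.e., all vectors whose support size is at most $k$, for

$$k < \frac{3\delta - 2}{2\delta - 1} \cdot (\gamma n - 1).$$

Proof: This result is easily obtained by combining Lemma 11 with [12, Theorem 1]. ■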

Interestingly, for $\delta = 3/4$ the recoverable sparsity $k$ matches exactly the performance of the fast compressed sensing algorithm in [31], [32] and the performance of the simple bit-flipping channel decoder of Sipser and Spielman [39]; however, our result holds for the CS-LPD / basis pursuit setup. Moreover, using results about expander graphs from [12], the above corollary implies, for example, that, for $m/n = 1/2$ and $d_{\mathrm{v}} = 32$, sparse expander-based zero-one measurement matrices will recover all $k = \alpha n$ sparse vectors for $\alpha \leq 0.000175$. To the best of our knowledge, the only previously known result for sparse measurement matrices under basis pursuit is the work of Berinde et al. [11]. As shown by the authors of that paper, the adjacency matrices of expander graphs (for expansion $\delta > 5/6$) will recover all $k$-sparse signals. Further, these authors also state results giving $\ell_1/\ell_1$ instance optimality sparse approximation guarantees. Their proof is directly done for the compressed sensing problem and is therefore fundamentally different from our approach, which uses the connection to channel coding. The result of Corollary 16 implies a strong bound for all $k$-sparse signals under basis pursuit and zero-one measurement matrices based on expander graphs. Since we only require expansion $\delta > 3/4$, however, we can obtain slightly better constants than [11]. Even though we present the result of recovering exactly $k$-sparse signals, the results of [33] can be used to establish $\ell_1/\ell_1$ sparse recovery for the same constants. We note that in the linear sparsity regime $k = \alpha n$, the scaling of $m = cn$ is order optimal and also the obtained constants are the best known for strong bounds of basis pursuit. Still, these theoretical bounds are quite far from the observed experimental performance. Also note that the works by Zhang and Pfister [37] and by Lu et al. [38] use density evolution arguments to determine the precise threshold constant for sparse measurement matrices, but these are for message-passing decoding algorithms, which are often not robust to noise and approximate sparsity.

In contrast to Corollary 16, which presented a strong bound, the following corollary presents a so-called weak bound (cf. the discussion in Section VI-A), but with a better threshold.

Corollary 17: Let $d_{\mathrm{v}} \in \mathbb{Z}_{>0}$. Consider a random measurement matrix $H_{\mathrm{CS}} \in \{0,1\}^{m \times n}$ formed by placing $d_{\mathrm{v}}$ random ones in each column, and zeros elsewhere. This measurement matrix succeeds in recovering a randomly supported $k = \alpha n$ sparse vector with probability $1 - o(1)$ if $\alpha$ is below some threshold value $\alpha_m(d_{\mathrm{v}}, m/n)$.

Proof: The result is obtained by combining Lemma 11 with [10, Theorem 1]. The latter paper also contains a way to compute the achievable threshold values $\alpha_m(d_{\mathrm{v}}, m/n)$. ■
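The random ensemble in Corollary 17 is simple to sample. The sketch below assumes that "placing $d_{\mathrm{v}}$ random ones in each column" means choosing $d_{\mathrm{v}}$ distinct rows uniformly at random and independently for every column; the exact ensemble analyzed in [10] may differ in details. The function name and the `seed` parameter are ours.

```python
import random


def random_column_regular_matrix(m, n, d_v, seed=None):
    """Return an m x n zero-one matrix (list of rows) with exactly d_v ones per column."""
    rng = random.Random(seed)
    H = [[0] * n for _ in range(m)]
    for col in range(n):
        # d_v distinct row positions, drawn uniformly without replacement
        for row in rng.sample(range(m), d_v):
            H[row][col] = 1
    return H
```

For example, `random_column_regular_matrix(500, 1000, 8)` produces a matrix with the parameters $m/n = 1/2$ and $d_{\mathrm{v}} = 8$ discussed next.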

Using results about expander graphs from [10], the above corollary implies, for example, that for $m/n = 1/2$ and $d_{\mathrm{v}} = 8$, a random measurement matrix will recover with high probability a $k = \alpha n$ sparse vector with random support if $\alpha \leq 0.002$. This is, of course, a much higher threshold compared to the one presented above, but it only holds with high probability over the vector support (therefore it is a so-called weak bound). To the best of our knowledge, this is the first weak bound obtained for random sparse measurement matrices under basis pursuit.

The best thresholds known for LP decoding were recently obtained by Arora, Daskalakis, and Steurer [13] but require matrices that are both left and right regular and also have logarithmically growing girth. A random bipartite matrix will not have logarithmically growing girth but there are explicit deterministic constructions that achieve this (for example the construction presented in Gallager's thesis [14, Appendix C]).

Corollary 18: Let $d_{\mathrm{v}}, d_{\mathrm{c}} \in \mathbb{Z}_{>0}$. Consider a measurement matrix $H_{\mathrm{CS}} \in \{0,1\}^{m \times n}$ whose Tanner graph is a $(d_{\mathrm{v}}, d_{\mathrm{c}})$-regular bipartite graph with $\Omega(\log n)$ girth. This measurement matrix succeeds in recovering a randomly supported $k = \alpha n$ sparse vector with probability $1 - o(1)$ if $\alpha$ is below some threshold function $\alpha'_m(d_{\mathrm{v}}, d_{\mathrm{c}}, m/n)$.

Proof: The result is obtained by combining Lemma 11 with [13, Theorem 1]. The latter paper also contains a way to compute the achievable threshold values $\alpha'_m(d_{\mathrm{v}}, d_{\mathrm{c}}, m/n)$. ■

Using results from [13], the above corollary yields for $m/n = 1/2$ and a $(3,6)$-regular Tanner graph with logarithmic girth (obtained from Gallager's construction) the fact that sparse vectors with sparsity $k = \alpha n$ are recoverable with high probability for $\alpha \leq 0.05$. Therefore, zero-one measurement matrices based on Gallager's deterministic LDPC construction form sparse measurement matrices with an order-optimal number of measurements (and the best known constants) for the CS-LPD / basis pursuit setup.

A note on deterministic constructions: We say that a method to construct a measurement matrix is deterministic if it can be created deterministically in polynomial time, or it has a property that can be verified in polynomial time. Unfortunately, all known bipartite expansion-based constructions are non-deterministic because even though random constructions will have the required expansion with high probability, there is, to the best of our knowledge, no known efficient way to check expansion above $\delta > 1/2$. Similarly, there are no known ways to verify the nullspace property or the restricted isometry property of a given candidate measurement matrix in polynomial time.

Footnote (to the regularity requirements of [13] mentioned above): However, as shown in [41], these requirements on the left and right degrees can be significantly relaxed.



There are several deterministic constructions of sparse measurement matrices [42], [43] which, however, would require a slightly sub-optimal number of measurements (i.e., $m$ growing super-linearly as a function of $n$ for $k = \alpha n$). The benefit of such constructions is that reconstruction can be performed via algorithms that are more efficient than generic convex optimization. To the best of our knowledge, there are no previously known constructions of deterministic measurement matrices with an optimal number of rows [44]. The best known constructions rely on explicit expander constructions [45], [46], but have slightly sub-optimal parameters [11], [44]. Our construction of Corollary 18 seems to be the first optimal deterministic construction.

One important technical innovation that arises from the machinery we develop is that girth can be used to certify good measurement matrices. Since checking and constructing high-girth graphs is much easier than constructing graphs with high expansion, we can obtain very good deterministic measurement matrices. For example, we can use Gallager's construction of LDPC matrices with logarithmic girth to obtain sparse zero-one measurement matrices with an order-optimal number of measurements under basis pursuit. The transition from expansion-based arguments to girth-based arguments was achieved for the channel coding problem in [47], then simplified and brought to a new analytical level by Arora et al. in [13], and afterwards generalized in [41]. Our connection results extend the applicability of these results to compressed sensing.
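That girth is checkable in polynomial time can be made concrete with a textbook procedure: run a breadth-first search from every node of the Tanner graph and record the shortest cycle closed by a non-tree edge. The sketch below is one such implementation for a zero-one matrix given as a list of rows. It is a generic graph routine with names of our choosing, not a procedure taken from the paper.

```python
from collections import deque


def tanner_girth(H):
    """Length of the shortest cycle in the Tanner graph of H (inf if the graph is a forest)."""
    m, n = len(H), len(H[0])
    # Nodes 0..n-1 are variable (left) nodes, nodes n..n+m-1 are check (right) nodes.
    adj = [[] for _ in range(n + m)]
    for j in range(m):
        for i in range(n):
            if H[j][i]:
                adj[i].append(n + j)
                adj[n + j].append(i)
    best = float("inf")
    for root in range(n + m):
        dist = {root: 0}
        parent = {root: None}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    parent[v] = u
                    queue.append(v)
                elif parent[u] != v:
                    # Non-tree edge: it closes a cycle through the BFS tree.
                    best = min(best, dist[u] + dist[v] + 1)
    return best
```

The running time is on the order of (number of nodes) times (number of edges), which is polynomial in the matrix size, in line with the remark that girth is much easier to check than expansion.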

We note that Corollary 18 yields a weak bound, i.e., the recovery of almost all $k$-sparse signals, and therefore does not guarantee recovering all $k$-sparse signals as the Capalbo et al. [45] construction (in conjunction with Corollary 16) would ensure. On the other hand, girth-based constructions have constants that are orders of magnitude higher than the ones obtained by random expanders. Since the construction of [45] gives constants that are worse than the ones for random expanders, it seems that girth-based measurement matrices have significantly higher provable thresholds of recovery. Finally, we note that following [13], logarithmic girth $\Omega(\log n)$ will yield a probability of failure decaying exponentially in the matrix size $n$. However, even the much smaller girth requirement $\Omega(\log \log n)$ is sufficient to make the probability of error decay as an inverse polynomial of $n$.

A final remark: Chandar [48] showed that zero-one measurement matrices cannot have an optimal number of measurements if they must satisfy the restricted isometry property for the $\ell_2$ norm. Note that this does not contradict our work, since, as mentioned earlier on, RIP is just a sufficient condition for signal recovery.

G. Comments on Dense Measurement Matrices 

We conclude this section with some considerations about 
dense measurement matrices, highlighting our current under- 
standing that the translation of positive performance guar- 
antees from CC-LPD to CS-LPD displays the following 
behavior: the denser a measurement matrix is, the weaker the 
translated performance guarantees are.

Remark 19: Consider a randomly generated $m \times n$ measurement matrix $H_{\mathrm{CS}}$ where every entry is generated i.i.d. according to the distribution

$$\begin{cases} +1 & \text{with probability } 1/6, \\ 0 & \text{with probability } 2/3, \\ -1 & \text{with probability } 1/6. \end{cases}$$

This matrix, after multiplying it by the scalar $\sqrt{3/n}$, has the restricted isometry property (RIP) with high probability. (See [49], which proves this property based on results in [50], which in turn proves that this family of matrices has a non-zero threshold.) On the other hand, one can show that the family of parity-check matrices where every entry is generated i.i.d. according to the distribution

$$\begin{cases} 1 & \text{with probability } 1/3, \\ 0 & \text{with probability } 2/3 \end{cases}$$

does not have a non-zero threshold under CC-LPD for the BSC. □

Therefore, we conclude that the connection between CS-LPD and CC-LPD given by Lemma 25 (an extension of Lemma 11 that is discussed in Appendix D) is not tight for dense matrices, in the sense that the performance of CS-LPD for dense measurement matrices can be much better than predicted by the translation of performance results for CC-LPD of the corresponding parity-check matrix.

VII. Reformulations based on Graph Covers 

The aim of this section is to tighten the already close formal relationship between CC-LPD and CS-LPD with the help of (topological) graph covers [52], [53]. We will see that the so-called (blockwise) graph-cover decoder [8] (see also [54]), which is equivalent to CC-LPD and which can be used to explain the close relationship between CC-LPD and message-passing iterative decoding algorithms like the min-sum algorithm, can be translated to the CS-LPD setup.

For an introduction to graph covers in general, and the graph-cover decoder in particular, see [8]. Figures 1 and 2 (taken from [8]) show the main idea behind graph covers. Namely, Figure 1 shows possible graph covers of some (general) graph and Figure 2 shows possible graph covers of some Tanner graph.

Note that in this section the compressed sensing setup will be over the complex numbers. Also, the entries of the size-$m \times n$ measurement matrix $H_{\mathrm{CS}}$ will be allowed to take on any value in $\mathbb{C}$, i.e., the entries of $H_{\mathrm{CS}}$ are not restricted to have absolute value equal to zero or one. Moreover, as in Section IV, the channel coding problem assumes an arbitrary binary-input output-symmetric memoryless channel, of which the binary-input additive white Gaussian noise (AWGN) channel and the binary symmetric channel (BSC) are prominent examples. As before, $x \in \{0,1\}^n$ will be the sent vector, $y \in \mathcal{Y}^n$ will be the received vector, and $\lambda \in \mathbb{R}^n$ will contain the log-likelihood ratios $\lambda_i = \lambda_i(y_i) \triangleq \log\big( P_{Y|X}(y_i|0) / P_{Y|X}(y_i|1) \big)$.

The rest of this section is organized as follows. In Sections VII-A and VII-B we show a variety of reformulations of CC-MLD and CC-LPD, respectively. In particular, the latter subsection shows reformulations of CC-LPD in terms of graph covers. Switching to compressed sensing, in Section VII-C we discuss reformulations of CS-OPT that reveal the close relationship between CC-MLD and CS-OPT. Afterwards, in Section VII-D we present reformulations of CS-LPD which highlight the close connections, and also the differences, between CC-LPD and CS-LPD.

Fig. 1. Top left: base graph $G$. Top right: a sample of possible 2-covers of $G$. Bottom left: a possible 3-cover of $G$. Bottom right: a possible $M$-cover of $G$. Here, the edge permutations are arbitrary.

Fig. 2. Left: Tanner graph $T(H)$. Middle: a possible 3-cover of $T(H)$. Right: a possible $M$-cover of $T(H)$. Here, the edge permutations are arbitrary.

A. Reformulations of CC-MLD 

This subsection discusses several reformulations of CC-MLD, first for general binary-input output-symmetric memoryless channels, then for the BSC. We start by repeating two reformulations of CC-MLD from Section IV.

CC-MLD1 : minimize $\langle \lambda, x' \rangle$
          subject to $x' \in \mathcal{C}_{\mathrm{CC}}$.

CC-MLD2 : minimize $\langle \lambda, x' \rangle$
          subject to $x' \in \mathrm{conv}(\mathcal{C}_{\mathrm{CC}})$.

Towards yet another reformulation of CC-MLD that we would like to present in this subsection, it is useful to introduce the hard-decision vector $\bar{y}$, along with the syndrome vector $s$ induced by $\bar{y}$.

Definition 20: Let $\bar{y} \in \{0,1\}^n$ be the hard-decision vector based on the log-likelihood ratio vector $\lambda$, namely let

$$\bar{y}_i \triangleq \begin{cases} 0 & \text{if } \lambda_i > 0 \\ 1 & \text{if } \lambda_i < 0 \end{cases} \qquad (\text{for all } i \in \mathcal{I}).$$

(If $\lambda_i = 0$, we set $\bar{y}_i \triangleq 0$ or $\bar{y}_i \triangleq 1$ according to some deterministic or random rule.) Moreover, let

$$s \triangleq H_{\mathrm{CC}} \cdot \bar{y} \pmod 2$$

be the syndrome induced by $\bar{y}$. □

Clearly, if the channel under consideration is a BSC with cross-over probability smaller than 1/2 then $\bar{y} = y$.

With this, we have for any binary-input output-symmetric memoryless channel the following reformulation of CC-MLD in terms of $e' \triangleq \bar{y} - x' \pmod 2$.

CC-MLD3 : minimize $\| \lambda_{\mathrm{supp}(e')} \|_1$
          subject to $H_{\mathrm{CC}} \cdot e' = s \pmod 2$.

Clearly, once the error vector estimate $e'$ is found, the codeword estimate $x'$ is obtained with the help of the expression $x' = \bar{y} - e' \pmod 2$.

Note that for the special case of a binary-input AWGNC, this reformulation can be found, for example, in [55] or [56, Chapter 10].

Theorem 21: CC-MLD3 is a reformulation of CC-MLD1.

Proof: See Appendix G. ■
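The equivalence stated in Theorem 21 can be observed on a toy code by exhaustive search. The sketch below assumes the hard-decision rule of Definition 20 (0 for positive log-likelihood ratios, 1 otherwise, so ties are resolved to 1 here as one admissible deterministic rule) and compares the codeword found by CC-MLD1 with the one recovered from the syndrome-based formulation CC-MLD3. All function names are ours, and the enumeration is exponential in the block length.

```python
from itertools import product


def syndrome(H, v):
    """H * v modulo 2, returned as a tuple."""
    return tuple(sum(h * x for h, x in zip(row, v)) % 2 for row in H)


def hard_decision(lam):
    """Hard-decision vector: 0 if the LLR is positive, 1 otherwise."""
    return tuple(0 if l > 0 else 1 for l in lam)


def cc_mld1(H, lam):
    """Minimize <lam, x> over all codewords x of the code with parity-check matrix H."""
    n = len(lam)
    zero = (0,) * len(H)
    codewords = [x for x in product((0, 1), repeat=n) if syndrome(H, x) == zero]
    return min(codewords, key=lambda x: sum(l * xi for l, xi in zip(lam, x)))


def cc_mld3(H, lam):
    """Minimize the sum of |lam_i| over supp(e) subject to H e = s (mod 2); return y_bar - e."""
    n = len(lam)
    y_bar = hard_decision(lam)
    s = syndrome(H, y_bar)
    errors = [e for e in product((0, 1), repeat=n) if syndrome(H, e) == s]
    e_best = min(errors, key=lambda e: sum(abs(l) for l, ei in zip(lam, e) if ei))
    return tuple((yi - ei) % 2 for yi, ei in zip(y_bar, e_best))


# Length-3 repetition code and two example LLR vectors.
H_rep = [[1, 1, 0],
         [0, 1, 1]]
```

With `lam = (2, -1, 3)` both routines return the all-zero codeword, and with `lam = (-2, 1, -3)` both return the all-one codeword. The two cost functions differ only by the constant $\langle \lambda, \bar{y} \rangle$, which is why the minimizers coincide.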

For a BSC we can specialize the above reformulations. Namely, for a BSC with cross-over probability $\varepsilon$, $0 \leq \varepsilon < 1/2$, we have $|\lambda_i| = L$, $i \in \mathcal{I}$, where $L \triangleq \log\big(\frac{1-\varepsilon}{\varepsilon}\big) > 0$. Then, with a slight abuse of notation by employing $\|\cdot\|_1$ also for vectors over $\mathbb{F}_2$, we obtain the following reformulation.

CC-MLD4 (BSC) : minimize $\|e'\|_1$
                subject to $H_{\mathrm{CC}} \cdot e' = s \pmod 2$.

Moreover, with a slight abuse of notation by employing $\|\cdot\|_0$ also for vectors over $\mathbb{F}_2$, CC-MLD4 (BSC) can be written as follows.

CC-MLD5 (BSC) : minimize $\|e'\|_0$
                subject to $H_{\mathrm{CC}} \cdot e' = s \pmod 2$.

B. Reformulations of CC-LPD

We start by repeating the definition of CC-LPD from Section IV.

CC-LPD : minimize $\langle \lambda, x' \rangle$
         subject to $x' \in \mathcal{P}(H_{\mathrm{CC}})$.

The aim of this subsection is to discuss various reformulations of CC-LPD in terms of graph covers. In particular, the following reformulation of CC-LPD was presented in [8] and was called (blockwise) graph-cover decoding.

CC-LPD1 : minimize $\frac{1}{M} \cdot \langle \lambda^{\uparrow M}, \tilde{x}' \rangle$
          subject to $\tilde{H}_{\mathrm{CC}} \cdot \tilde{x}' = 0^{\uparrow M} \pmod 2$.

Here the minimization is over all $M \in \mathbb{Z}_{>0}$ and over all parity-check matrices $\tilde{H}_{\mathrm{CC}}$ induced by all possible $M$-covers of the Tanner graph of $H_{\mathrm{CC}}$.

Using the same line of reasoning as in Section VII-A, CC-LPD can be rewritten as follows.

CC-LPD2 : minimize $\frac{1}{M} \cdot \| \lambda^{\uparrow M}_{\mathrm{supp}(\tilde{e}')} \|_1$
          subject to $\tilde{H}_{\mathrm{CC}} \cdot \tilde{e}' = s^{\uparrow M} \pmod 2$.

Again, the minimization is over all $M \in \mathbb{Z}_{>0}$ and over all parity-check matrices $\tilde{H}_{\mathrm{CC}}$ induced by all possible $M$-covers of the Tanner graph of $H_{\mathrm{CC}}$.

For the BSC with cross-over probability $\varepsilon$, $0 \leq \varepsilon < 1/2$, we get, with a slight abuse of notation as in Section VII-A, the following specialized results.

CC-LPD3 (BSC) : minimize $\frac{1}{M} \cdot \|\tilde{e}'\|_1$
                subject to $\tilde{H}_{\mathrm{CC}} \cdot \tilde{e}' = s^{\uparrow M} \pmod 2$.

CC-LPD4 (BSC) : minimize $\frac{1}{M} \cdot \|\tilde{e}'\|_0$
                subject to $\tilde{H}_{\mathrm{CC}} \cdot \tilde{e}' = s^{\uparrow M} \pmod 2$.



C. Reformulations of CS-OPT

We start by repeating the definition of CS-OPT from Section III.

CS-OPT : minimize $\|e'\|_0$
         subject to $H_{\mathrm{CS}} \cdot e' = s$.

Clearly, this is formally very similar to CC-MLD5 (BSC).

In order to show the tight formal relationship of CS-OPT with CC-MLD for general binary-input output-symmetric memoryless channels, in particular with respect to the reformulation CC-MLD3, we rewrite CS-OPT as follows.

CS-OPT1 : minimize $\| \mathbb{1}_{\mathrm{supp}(e')} \|_1$
          subject to $H_{\mathrm{CS}} \cdot e' = s$.

Footnote (to CC-LPD1 in Section VII-B): Note that here $\tilde{H}_{\mathrm{CC}}$ is obtained by the standard procedure to construct a graph cover [8], and not by the procedure in Definition 27 (cf. Appendix D).



D. Reformulations of CS-LPD 

We now come to the main part of this section, namely the
reformulation of CS-LPD in terms of graph covers. We start
by repeating the definition of CS-LPD from Section III.

CS-LPD : minimize $\|e'\|_1$

subject to $H_{CS} \cdot e' = s$.

As shown in the upcoming Theorem 22, CS-LPD can be
rewritten as follows.

CS-LPD1 : minimize $\frac{1}{M} \cdot \|\tilde{e}'\|_1$

subject to $\tilde{H}_{CS} \cdot \tilde{e}' = s^{\uparrow M}$.

Here the minimization is over all $M \in \mathbb{Z}_{>0}$ and over all
measurement matrices $\tilde{H}_{CS}$ induced by all possible $M$-covers
of the Tanner graph of $H_{CS}$.

Theorem 22: CS-LPD1 is a reformulation of CS-LPD.
Proof: See Appendix H. ■

Clearly, CS-LPD1 is formally very close to CC-LPD3
(BSC), thereby showing that graph covers can be used to
exhibit yet another tight formal relationship between CS-LPD
and CC-LPD.

Nevertheless, these graph-cover based reformulations also 
highlight differences between the relaxation used in the context 
of channel coding and the relaxation used in the context of 
compressed sensing. 

• When relaxing CC-MLD to obtain CC-LPD, the cost
function remains the same (call this property P1) but
the domain is relaxed (call this property P2). In the
graph-cover reformulations of CC-LPD, property P1 is
reflected by the fact that the cost function is a straightforward
generalization of the cost function for CC-MLD.
Property P2 is reflected by the fact that in general
there are feasible vectors in graph covers that cannot be
explained as liftings of (convex combinations of) feasible
vectors in the base graph and that, for suitable $\lambda$-vectors,
have strictly lower cost function values than any feasible
vector in the base graph.
• When relaxing CS-OPT to obtain CS-LPD, the cost
function is changed (call this property P1'), but the
domain remains the same (call this property P2'). In
the graph-cover reformulations of CS-LPD, property P1'
is reflected by the fact that the cost function is not a
straightforward generalization of the cost function of CS-OPT.
Property P2' is reflected by the fact that feasible
vectors in graph covers are such that they do not yield
cost function values that are smaller than the cost function
value of the best feasible vector in the base graph.

VIII. Minimizing the Zero-Infinity Operator 

For any real vector $a$ we define the zero-infinity operator
to be

$$\|a\|_{0,\infty} \triangleq \|a\|_0 \cdot \|a\|_\infty,$$

i.e., the product of the zero norm $\|a\|_0 = |\operatorname{supp}(a)|$ of $a$ and
of the infinity norm $\|a\|_\infty = \max_i |a_i|$ of $a$. Note that for
any $c \in \mathbb{C}$ and any real vector $a$ it holds that $\|c \cdot a\|_{0,\infty} =
|c| \cdot \|a\|_{0,\infty}$.
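The definition above can be checked numerically with a few lines of Python. This is only an illustrative sketch: the function names `zero_norm`, `infinity_norm`, and `zero_infinity` are chosen here for convenience and do not come from the paper.

```python
def zero_norm(a):
    """Number of non-zero components, i.e. |supp(a)|."""
    return sum(1 for x in a if x != 0)


def infinity_norm(a):
    """Largest absolute value of a component (0 for an empty vector)."""
    return max((abs(x) for x in a), default=0)


def zero_infinity(a):
    """Zero-infinity operator: product of the zero norm and the infinity norm."""
    return zero_norm(a) * infinity_norm(a)


a = [0.0, -3.0, 0.0, 1.5]
print(zero_infinity(a))                    # two non-zeros times the largest magnitude 3.0
print(zero_infinity([-2 * x for x in a]))  # scaling by c = -2 multiplies the value by |c| = 2
print(zero_infinity([1, 0, 1, 1]))         # for a 0/1 vector this is the number of ones
```

For a zero-one vector the infinity norm is one (unless the vector is all-zero), so the operator reduces to counting the non-zero positions.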

Based on this operator, in the present section we introduce
CS-OPT$_{0,\infty}$, and we show, with the help of graph covers, that
CS-LPD can not only be seen as a relaxation of CS-OPT but
also as a relaxation of CS-OPT$_{0,\infty}$. We do this by proposing
a relaxation of CS-OPT$_{0,\infty}$, called CS-REL$_{0,\infty}$, and by then
showing that CS-REL$_{0,\infty}$ is equivalent to CS-LPD.

Moreover, we argue that the solution of CS-LPD is "closer"
to the solution of CS-OPT$_{0,\infty}$ than the solution of CS-LPD is
to the solution of CS-OPT. Note that, similar to CS-OPT, the
problem CS-OPT$_{0,\infty}$ is in general an intractable optimization
problem.

One motivation for looking for different problems whose
relaxation equals CS-LPD is to better understand the
"strengths" and "weaknesses" of CS-LPD. In particular, if
CS-LPD is the relaxation of two different problems (like
CS-OPT and CS-OPT$_{0,\infty}$), but these two problems yield
different solutions, then the solution of the relaxed problem
will disagree with the solution of at least one of the two
problems.

This section is structured as follows. We start by defining
CS-OPT$_{0,\infty}$ in Section VIII-A. Then, in Section VIII-B, we
discuss some geometrical aspects of CS-OPT$_{0,\infty}$, in particular
with respect to the geometry behind CS-OPT and CS-LPD.
Finally, in Section VIII-C, we introduce CS-REL$_{0,\infty}$ and
show its equivalence to CS-LPD.

A. Definition of CS-OPT$_{0,\infty}$

The optimization problem CS-OPT$_{0,\infty}$ is defined as follows.

CS-OPT$_{0,\infty}$ : minimize $\|e'\|_{0,\infty}$

subject to $H_{CS} \cdot e' = s$.



Whereas the cost function of CS-OPT, i.e., $\|e'\|_0$, measures
the sparsity of $e'$ but not the magnitude of the elements of $e'$,
the cost function of CS-OPT$_{0,\infty}$, i.e., $\|e'\|_{0,\infty}$, represents a
trade-off between measuring the sparsity of $e'$ and measuring
the largest magnitude of the components of $e'$. Clearly, in
the same way that there are many good reasons to look for
the vector $e'$ that minimizes the zero-norm (among all $e'$ that
satisfy $H_{CS} \cdot e' = s$), there are also many good reasons to
look for the vector $e'$ that minimizes the zero-infinity operator
(among all $e'$ that satisfy $H_{CS} \cdot e' = s$). In particular, the latter
is attractive when we are looking for a sparse vector $e'$ that
does not have an imbalance in magnitudes between the largest
component and the set of most important components.

Fig. 3. Unit balls for some operators. Left: $\{e' \in \mathbb{R}^2 \mid \|e'\|_0 \le 1\}$. Middle:
$\{e' \in \mathbb{R}^2 \mid \|e'\|_{0,\infty} \le 1\}$. Right: $\{e' \in \mathbb{R}^2 \mid \|e'\|_1 \le 1\}$.

With a slight abuse of notation, we can apply the zero-infinity
operator $\|\cdot\|_{0,\infty}$ also to vectors over $\mathbb{F}_2$ and obtain
the following reformulation of CC-MLD (BSC). (Note that for
any vector $a$ over $\mathbb{F}_2$ it holds that $\|a\|_{0,\infty} = \|a\|_0 = w_{\mathrm{H}}(a)$.)

CC-MLD6 (BSC) : minimize $\|e'\|_{0,\infty}$

subject to $H_{CC} \cdot e' = s$ (mod 2).

This clearly shows that there is a close formal relationship
not only between CC-MLD (BSC) and CS-OPT, but also
between CC-MLD (BSC) and CS-OPT$_{0,\infty}$.



B. Geometrical Aspects of CS-OPT$_{0,\infty}$

We want to discuss some geometrical aspects of CS-OPT,
CS-OPT$_{0,\infty}$, and CS-LPD. Namely, as is well known, CS-OPT
can be formulated as finding the smallest $\ell_0$-norm
ball of radius $r$ (cf. Fig. 3 (left)) that intersects the set
$\{e' \mid H_{CS} \cdot e' = s\}$, and in the same spirit, CS-LPD
can be formulated as finding the smallest $\ell_1$-norm ball of
radius $r$ (cf. Fig. 3 (right)) that intersects the set
$\{e' \mid H_{CS} \cdot e' = s\}$. Clearly, the fact that CS-OPT and
CS-LPD can yield different solutions stems from the fact that
these balls have different shapes. Of course, the success of
CS-LPD is a consequence of the fact that, nevertheless, under
suitable conditions, the solution given by the $\ell_1$-norm ball is
(nearly) the same as the solution given by the $\ell_0$-norm ball.

In the same vein, CS-OPT$_{0,\infty}$ can be formulated as finding
the smallest zero-infinity-operator ball of radius $r$ (cf. Fig. 3
(middle)) that intersects the set $\{e' \mid H_{CS} \cdot e' = s\}$.
As can be seen from Fig. 3, the zero-infinity-operator
unit ball is closer in shape to the $\ell_1$-norm unit ball than the
$\ell_0$-norm unit ball is to the $\ell_1$-norm unit ball. Therefore, we
expect that the solution given by CS-LPD is "closer" to the
solution given by CS-OPT$_{0,\infty}$ than the solution of CS-LPD is
to the solution given by CS-OPT. In that sense, CS-OPT$_{0,\infty}$
is, at least as justifiably as CS-OPT, a difficult optimization
problem whose solution is approximated by CS-LPD.
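The relationship between the three unit balls in two dimensions can be probed with a small membership test. The sketch below is illustrative only; the helper names and the sample points are chosen here and are not taken from the paper. Exact rational arithmetic is used so that boundary points are classified without rounding issues.

```python
from fractions import Fraction


def l0(e):
    """Zero norm: number of non-zero components."""
    return sum(1 for x in e if x != 0)


def l1(e):
    """l1 norm: sum of absolute values."""
    return sum(abs(x) for x in e)


def zero_infinity(e):
    """Zero-infinity operator: zero norm times infinity norm."""
    return l0(e) * max((abs(x) for x in e), default=0)


def in_unit_ball(op, e):
    """True if the point e lies in the unit ball of the operator op."""
    return op(e) <= 1


half = Fraction(1, 2)
points = [(1, 0), (half, half), (Fraction(3, 5), Fraction(3, 10)), (5, 0)]
for p in points:
    # Membership in the l0, zero-infinity, and l1 unit balls, in that order.
    print(p, [in_unit_ball(op, p) for op in (l0, zero_infinity, l1)])
```

On these sample points, every point of the zero-infinity unit ball also lies in the $\ell_1$ unit ball, whereas the $\ell_0$ unit ball contains the far-away point $(5, 0)$; this mirrors the observation that the middle ball is closer in shape to the right ball than the left ball is.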



C. Relaxation of CS-OPT$_{0,\infty}$

In this subsection we introduce CS-REL$_{0,\infty}$ as a relaxation
of CS-OPT$_{0,\infty}$; the main result will be that CS-REL$_{0,\infty}$
equals CS-LPD. Our results will be formulated in terms of
graph covers; we therefore use the graph-cover related notation
that was introduced in Section VII, along with the mapping
$\varphi_M$ that was defined in Section II.

In order to motivate the formulation of CS-REL$_{0,\infty}$, we
first present a reformulation of CC-LPD (BSC). Namely,
CC-LPD3 (BSC) or CC-LPD4 (BSC) from Section VII-B can be
rewritten as follows.

CC-LPD5 (BSC) : minimize $\frac{1}{M} \cdot \|\tilde{e}'\|_{0,\infty}$

subject to $\tilde{H}_{CC} \cdot \tilde{e}' = s^{\uparrow M}$ (mod 2).

Then, because for any vector $\tilde{s} \in \mathbb{F}_2^{M \cdot |\mathcal{J}|}$ it holds that
$\varphi_M(\tilde{s}) = s$ if and only if $\tilde{s} = s^{\uparrow M}$, CC-LPD5 (BSC) can
also be written as follows.

CC-LPD6 (BSC) : minimize $\frac{1}{M} \cdot \|\tilde{e}'\|_{0,\infty}$

subject to $\tilde{H}_{CC} \cdot \tilde{e}' = \tilde{s}$ (mod 2)
$\varphi_M(\tilde{s}) = s$.

The transition that leads from CC-MLD to its relaxation
CC-LPD6 (BSC) inspires a relaxation of CS-OPT$_{0,\infty}$ as follows.











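CS-REL$_{0,\infty}$ : minimize $\frac{1}{M} \cdot \|\tilde{e}'\|_{0,\infty}$

subject to $\tilde{H}_{CS} \cdot \tilde{e}' = \tilde{s}$
$\varphi_M(\tilde{s}) = s$.

Here the minimization is over all $M \in \mathbb{Z}_{>0}$ and over
all measurement matrices $\tilde{H}_{CS}$ induced by all possible $M$-covers
of the Tanner graph of $H_{CS}$. Note that, in contrast to
CC-LPD6 (BSC), in general the optimal solution $(\tilde{e}', \tilde{s})$ of
CS-REL$_{0,\infty}$ does not satisfy $\tilde{s} = s^{\uparrow M}$.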

Towards establishing the equivalence of CS-REL$_{0,\infty}$ and
CS-LPD, the following simple lemma will prove to be useful.

Lemma 23: For any real vector $a$ it holds that

$$\|a\|_1 \le \|a\|_{0,\infty},$$

with equality if and only if all non-zero components of $a$ have
the same absolute value.

Proof: The proof of this lemma is straightforward. ■
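For completeness, the one-line argument behind Lemma 23 can be spelled out as follows (a sketch using only the definitions of the zero norm and the infinity norm given above):

```latex
\|a\|_1
  = \sum_{i \in \operatorname{supp}(a)} |a_i|
  \le \sum_{i \in \operatorname{supp}(a)} \|a\|_\infty
  = |\operatorname{supp}(a)| \cdot \|a\|_\infty
  = \|a\|_0 \cdot \|a\|_\infty
  = \|a\|_{0,\infty}.
```

The single inequality is tight exactly when $|a_i| = \|a\|_\infty$ for every $i \in \operatorname{supp}(a)$, i.e., when all non-zero components have the same absolute value.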

Theorem 24: Let $H_{CS}$ be a measurement matrix over the
reals with entries equal to zero, one, and minus one. For
syndrome vectors $s$ that have only rational components,
CS-LPD and CS-REL$_{0,\infty}$ are equivalent in the sense that there is
an optimal $\hat{e}'$ in CS-LPD and an optimal $\tilde{e}'$ in CS-REL$_{0,\infty}$
such that $\hat{e}' = \varphi_M(\tilde{e}')$.

Proof: See Appendix I. ■

IX. Conclusions and Outlook

In this paper we have established a mathematical connection 
between channel coding and compressed sensing LP relax- 
ations. The key observation, in its simplest version, was that 
points in the nullspace of a zero-one matrix (considered over 
the reals) can be mapped to points in the fundamental cone of 
the same matrix (considered as the parity-check matrix of a 
code over F2). This allowed us to show, among other results, 
that parity-check matrices of "good" channel codes can be 
used as provably "good" measurement matrices under basis 
pursuit. 

Let us comment on a variety of topics. 

• In addition to CS-LPD, a number of combinatorial al- 
gorithms {e.g. in, EB, Ea, S, Ea, m) have 
been proposed for compressed sensing problems, with 
the benefit of faster decoding complexity and comparable 
performance to CS-LPD. It would be interesting to 
investigate if the connection of sparse recovery problems 
to channel coding extends in a similar manner for these 
decoders. One example of such a clear connection is the 
bit-flipping algorithm of Sipser and Spielman |[39l and 
the corresponding algorithm for compressed sensing by 
Xu and Hassibi |3r|. Channel-coding-inspired message- 
passing decoders for compressed sensing problems were 
also recently discussed in 133, EH, E^l-fei]. 

• An interesting research direction is to use optimized
LDPC matrices (see, e.g., [23]) to create measurement
matrices. There is a large body of channel coding work
that could be transferable to the measurement matrix
design problem.

In this context, an important theoretical question is related 
to being able to certify in polynomial time that a given 
measurement matrix has "good" performance. To the 
best of our knowledge, our results form the first known 
case where girth, an efficiently checkable property, can 
be used as a certificate of goodness of a measurement 
matrix. It is possible that girth can be used to establish a 
success witness for CS-LPD directly, and this would be 
an interesting direction for future research. 

• One important research direction in compressed sensing
involves dealing with noisy measurements. This problem
can still be addressed with $\ell_1$ minimization (see, e.g.,
[62]) and also with less complex signal reconstruction
algorithms (see, e.g., [63]). It would be very interesting to
investigate if our nullspace connections can be extended
to a coding theory result equivalent to noisy compressed
sensing.

• Beyond channel coding problems, the LP relaxation of 16] 
is a special case of a relaxation of the marginal polytope 
for general graphical models. One very interesting re- 
search direction is to explore if the connection we have 
established between CS-LPD and CC-LPD is also just a 
special case of a more general theory. 

• We have also discussed various reformulations of the
optimization problems under investigation. This leads to
a strengthening of the ties between some of the optimization
problems. Moreover, we have introduced the zero-infinity
operator optimization problem CS-OPT$_{0,\infty}$, an
optimization problem with the property that the solution
of CS-LPD can be considered to be at least as good
an approximation of the solution of CS-OPT$_{0,\infty}$ as the
solution of CS-LPD is an approximation of the solution
of CS-OPT. We leave it as an open question if the results
and observations of Section VIII can be generalized for
more general matrices or specific families of signals (like
non-negative sparse signals as in [64], [65]).

Appendix A
Proof of TheoremO 

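Suppose that $H_{CS}$ has the claimed nullspace property. Since
$H_{CS} \cdot e = s$ and $H_{CS} \cdot \hat{e} = s$, it easily follows that $\nu \triangleq e - \hat{e}$
is in the nullspace of $H_{CS}$. So,

$$\begin{aligned}
\|e_S\|_1 + \|e_{\bar S}\|_1 &= \|e\|_1 \\
&\overset{(a)}{\ge} \|\hat{e}\|_1 \\
&= \|e - \nu\|_1 \\
&= \|e_S - \nu_S\|_1 + \|e_{\bar S} - \nu_{\bar S}\|_1 \\
&\overset{(b)}{\ge} \|e_S\|_1 - \|\nu_S\|_1 + \|\nu_{\bar S}\|_1 - \|e_{\bar S}\|_1 \\
&\overset{(c)}{\ge} \|e_S\|_1 + \frac{C-1}{C+1} \cdot \|\nu\|_1 - \|e_{\bar S}\|_1, \qquad (5)
\end{aligned}$$

where step (a) follows from the fact that the solution of CS-LPD
satisfies $\|\hat{e}\|_1 \le \|e\|_1$, where step (b) follows from
applying the triangle inequality property of the $\ell_1$-norm twice,
and where step (c) follows from

$$-\|\nu_S\|_1 + \|\nu_{\bar S}\|_1 \overset{(d)}{\ge} \frac{C-1}{C+1} \cdot \|\nu\|_1.$$

Here, step (d) is a consequence of

$$\begin{aligned}
(C+1) \cdot \bigl(-\|\nu_S\|_1 + \|\nu_{\bar S}\|_1\bigr)
&= -C \cdot \|\nu_S\|_1 - \|\nu_S\|_1 + C \cdot \|\nu_{\bar S}\|_1 + \|\nu_{\bar S}\|_1 \\
&\overset{(e)}{\ge} -\|\nu_{\bar S}\|_1 - \|\nu_S\|_1 + C \cdot \|\nu_{\bar S}\|_1 + C \cdot \|\nu_S\|_1 \\
&= (C-1) \cdot \|\nu_S\|_1 + (C-1) \cdot \|\nu_{\bar S}\|_1 \\
&= (C-1) \cdot \|\nu\|_1,
\end{aligned}$$

where step (e) follows from applying twice the fact that $\nu \in
\mathrm{Nullsp}_{\mathbb{R}}(H_{CS})$ and the assumption that $H_{CS} \in \mathrm{NSP}^{\le}_{\mathbb{R}}(k, C)$.
Subtracting the term $\|e_S\|_1$ on both sides of (5), and solving
for $\|\nu\|_1 = \|e - \hat{e}\|_1$ yields the promised result.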

Appendix B 
Proof of Lemma 8

Without loss of generality, we can assume that the all-zero
codeword was transmitted. Let $+L > 0$ be the log-likelihood
ratio associated with a received 0, and let $-L < 0$ be the
log-likelihood ratio associated with a received 1. Therefore,
$\lambda_i = +L$ if $i \in \bar{S}$ and $\lambda_i = -L$ if $i \in S$. Then it follows
from the assumptions in the lemma statement that for any
$\omega \in \mathcal{K}(H_{CC}) \setminus \{0\}$ it holds that

$$\langle \lambda, \omega \rangle
= \sum_{i \in \bar S} (+L) \cdot \omega_i + \sum_{i \in S} (-L) \cdot \omega_i
\overset{(a)}{=} L \cdot \|\omega_{\bar S}\|_1 - L \cdot \|\omega_S\|_1
\overset{(b)}{>} 0 = \langle \lambda, 0 \rangle,$$

where step (a) follows from the fact that $|\omega_i| = \omega_i$ for all
$i \in \mathcal{I}(H_{CC})$, and where step (b) follows from (4). Therefore,
under CC-LPD the all-zero codeword has the lowest cost function
value when compared to all non-zero pseudo-codewords
in the fundamental cone, and therefore also compared to all
non-zero pseudo-codewords in the fundamental polytope.

Appendix C 
Proof of Lemma 10

Case 1: Let $|S| < \frac{1}{2} \cdot w_{\mathrm{p}}^{\mathrm{BSC}}(\omega)$. The proof is by
contradiction: assume that $\|\omega_S\|_1 \ge \|\omega_{\bar S}\|_1$. This statement is
clearly equivalent to the statement that $2 \cdot \|\omega_S\|_1 \ge \|\omega_S\|_1 +
\|\omega_{\bar S}\|_1 = \|\omega\|_1$, which is equivalent to the statement that
$\|\omega_S\|_1 \ge \frac{1}{2} \cdot \|\omega\|_1$. In terms of the notation in Definition 9,
this means that



w. 



BSC 



(b) 

s$ 2 



^s\i 



1 



2 



(a) 

< 2 • i^" 
S\-\\u:l 



2-151 



where at step (a) we have used the fact that $F^{-1}$ is a (strictly)
non-decreasing function and where at step (b) we have used
the fact that the slope of $F^{-1}$ (over the domain where $F^{-1}$ is
defined) is at least $1/\|\omega\|_\infty$. The obtained inequality, however,
is a contradiction to the assumption that $|S| < \frac{1}{2} \cdot w_{\mathrm{p}}^{\mathrm{BSC}}(\omega)$.

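Case 2: Let $|S| < \frac{1}{2} \cdot w_{\mathrm{p}}^{\mathrm{BSC}'}(\omega)$. The proof is by
contradiction: assume that $\|\omega_S\|_1 \ge \|\omega_{\bar S}\|_1$. Then, using the definition
of $\omega'$ based on $\omega$ (cf. Section IV-C), we obtain

$$\|\omega'_{\{1,\ldots,|S|\}}\|_1 \ge \|\omega_S\|_1 \ge \|\omega_{\bar S}\|_1 \ge \|\omega'_{\{|S|+1,\ldots,n\}}\|_1.$$

If $w_{\mathrm{p}}^{\mathrm{BSC}'}(\omega)$ is an even integer, then the above line of
inequalities shows that $|S| \ge \frac{1}{2} \cdot w_{\mathrm{p}}^{\mathrm{BSC}'}(\omega)$, which is a contradiction
to the assumption that $|S| < \frac{1}{2} \cdot w_{\mathrm{p}}^{\mathrm{BSC}'}(\omega)$. If $w_{\mathrm{p}}^{\mathrm{BSC}'}(\omega)$ is
an odd integer, then the above line of inequalities shows that
$|S| \ge \frac{1}{2} \cdot \bigl(w_{\mathrm{p}}^{\mathrm{BSC}'}(\omega) + 1\bigr)$, which again is a
contradiction to the assumption that $|S| < \frac{1}{2} \cdot w_{\mathrm{p}}^{\mathrm{BSC}'}(\omega)$.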

Appendix D 
Extensions of the Bridge Lemma 

The aim of this appendix is to extend Lemma 11 (cf. Section V)
to measurement matrices beyond zero-one matrices. In
that vein we will present three generalizations in Lemmas 25,
29, and 31. Note that the setup in this appendix will be slightly
more general than the compressed sensing setup in Section III
(and in most of the rest of this paper). In particular, we allow
matrices and vectors to be over $\mathbb{C}$, and not just over $\mathbb{R}$.

We will need some additional notation. Namely, similarly
to the way that we have extended the absolute value operator
$|\cdot|$ from scalars to vectors at the beginning of Section V, we
will now extend its use from scalars to matrices.

Moreover, we let $|\cdot|_*$ be an arbitrary norm for the complex
numbers. As such, $|\cdot|_*$ satisfies for any $a, b, c \in \mathbb{C}$ the triangle
inequality $|a + b|_* \le |a|_* + |b|_*$ and the equality $|c \cdot a|_* =
|c| \cdot |a|_*$. In the same way the absolute value operator $|\cdot|$ was
extended from scalars to vectors and matrices, we extend the
norm operator $|\cdot|_*$ from scalars to vectors and matrices.

We let $\|\cdot\|_*$ be an arbitrary vector norm for complex vectors
that reduces to $|\cdot|_*$ for vectors with one component. As such,
$\|\cdot\|_*$ satisfies for any $c \in \mathbb{C}$ and any complex vectors $a$ and
$b$ with the same number of components the triangle inequality
$\|a + b\|_* \le \|a\|_* + \|b\|_*$ and the equality $\|c \cdot a\|_* = |c| \cdot \|a\|_*$.

We are now ready to discuss our first extension of
Lemma 11, which generalizes the setup of that lemma from
real measurement matrices where every entry is equal to
either zero or one to complex measurement matrices where
the absolute value of every entry is equal to either zero
or one. Note that the upcoming lemma also generalizes the
mapping that is applied to the vectors in the nullspace of the
measurement matrix.



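Lemma 25: Let $H_{CS} = (h_{j,i})_{j,i}$ be a measurement matrix
over $\mathbb{C}$ such that $|h_{j,i}| \in \{0, 1\}$ for all $(j, i) \in \mathcal{J}(H_{CS}) \times
\mathcal{I}(H_{CS})$, and let $|\cdot|_*$ be an arbitrary norm on $\mathbb{C}$. Then

$$\nu \in \mathrm{Nullsp}_{\mathbb{C}}(H_{CS}) \;\Rightarrow\; |\nu|_* \in \mathcal{K}(|H_{CS}|).$$

Remark: Note that $\operatorname{supp}(\nu) = \operatorname{supp}(|\nu|_*)$.

Proof: Let $\omega \triangleq |\nu|_*$. In order to show that such a vector
$\omega$ is indeed in the fundamental cone of $|H_{CS}|$, we need to
verify (2) and (3). The way $\omega$ is defined, it is clear that
it satisfies (2). Therefore, let us focus on the proof that $\omega$
satisfies (3). Namely, from $\nu \in \mathrm{Nullsp}_{\mathbb{C}}(H_{CS})$ it follows that
for all $j \in \mathcal{J}$, $\sum_{i \in \mathcal{I}} h_{j,i} \nu_i = 0$. For all $j \in \mathcal{J}$ and all $i \in \mathcal{I}_j$,
this implies that

$$\begin{aligned}
\omega_i = |\nu_i|_* = |h_{j,i}| \cdot |\nu_i|_* = |h_{j,i} \nu_i|_*
&= \Bigl| -\sum_{i' \in \mathcal{I} \setminus i} h_{j,i'} \nu_{i'} \Bigr|_*
\le \sum_{i' \in \mathcal{I} \setminus i} |h_{j,i'} \nu_{i'}|_* \\
&= \sum_{i' \in \mathcal{I} \setminus i} |h_{j,i'}| \cdot |\nu_{i'}|_*
= \sum_{i' \in \mathcal{I}_j \setminus i} |\nu_{i'}|_*
= \sum_{i' \in \mathcal{I}_j \setminus i} \omega_{i'},
\end{aligned}$$

showing that $\omega$ indeed satisfies (3). ■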
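Example 26: The measurement matrix

$$H_{CS} \triangleq \begin{pmatrix} 1 & 0 & \frac{1}{\sqrt{2}}(1+i) \\ -1 & i & 1 \end{pmatrix}$$

satisfies

$$|H_{CS}| = \begin{pmatrix} 1 & 0 & 1 \\ 1 & 1 & 1 \end{pmatrix},$$

and so Lemma 25 is applicable. An example of a vector in
$\mathrm{Nullsp}_{\mathbb{C}}(H_{CS})$ is

$$\nu = \Bigl( -\tfrac{1}{\sqrt{2}}(1+i),\; i + \tfrac{1}{\sqrt{2}}(i-1),\; 1 \Bigr).$$

Choosing $|\cdot|_* = |\cdot|$, we obtain

$$|\nu| = \bigl( 1,\; \sqrt{2 + \sqrt{2}},\; 1 \bigr) = (1,\; 1.848\ldots,\; 1) \in \mathcal{K}(|H_{CS}|).$$

□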
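The statement of Lemma 25 can be checked numerically on a small instance. The sketch below assumes the standard description of the fundamental cone, namely $\omega_i \ge 0$ for all $i$ and $\omega_i \le \sum_{i' \in \mathcal{I}_j \setminus i} \omega_{i'}$ for every check $j$ and every $i \in \mathcal{I}_j$. The matrix `H`, the vector `nu`, and the function name `in_fundamental_cone` are hypothetical and chosen here purely for illustration; they are not taken from the paper.

```python
def in_fundamental_cone(H_abs, w, tol=1e-9):
    """Check the fundamental cone inequalities for a zero-one matrix H_abs."""
    if any(x < -tol for x in w):
        return False  # non-negativity violated
    for row in H_abs:
        support = [i for i, h in enumerate(row) if h != 0]
        total = sum(w[i] for i in support)
        for i in support:
            # Each entry must not exceed the sum of the other entries in the check.
            if w[i] > (total - w[i]) + tol:
                return False
    return True


# Hypothetical complex matrix whose entries all have absolute value 0 or 1.
H = [[1, 1, 1j, 0],
     [0, 1, -1, 1]]
# A vector in the complex nullspace of H (verified by the residual below).
nu = [-2 - 1j, 2, 1, -1]

residual = [sum(h * x for h, x in zip(row, nu)) for row in H]
H_abs = [[abs(h) for h in row] for row in H]
omega = [abs(x) for x in nu]  # the choice |.|_* = |.|

print(max(abs(r) for r in residual))          # nullspace check
print(in_fundamental_cone(H_abs, omega))      # cone membership of |nu|
print(in_fundamental_cone(H_abs, [3, 1, 1, 1]))  # a vector violating the first check
```

The last call illustrates that membership is not automatic: the vector $(3, 1, 1, 1)$ fails because its first entry exceeds the sum of the other entries participating in the first check.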

The second extension of Lemma 11 generalizes that lemma
to hold also for complex measurement matrices where the
absolute value of every entry is an integer. In order to
present this lemma, we need the following definition, which
is subsequently illustrated by Example 28.

Definition 27: Let $H_{CS} = (h_{j,i})_{j,i}$ be a measurement
matrix over $\mathbb{C}$ such that $|h_{j,i}| \in \mathbb{Z}_{\ge 0}$ for all $(j, i) \in \mathcal{J}(H_{CS}) \times
\mathcal{I}(H_{CS})$, and let $M \in \mathbb{Z}_{>0}$ be such that $M \ge \max_{(j,i)} |h_{j,i}|$.
We define an $M$-fold cover $\tilde{H}_{CS}$ of $H_{CS}$ as follows: for
$(j, i) \in \mathcal{J}(H_{CS}) \times \mathcal{I}(H_{CS})$, if the scalar $h_{j,i}$ is non-zero
then it is replaced by a matrix, namely $h_{j,i}/|h_{j,i}|$ times the
sum of $|h_{j,i}|$ arbitrary $M \times M$ permutation matrices with
non-overlapping support. However, if $h_{j,i} = 0$ then the scalar $h_{j,i}$
is replaced by an all-zero matrix of size $M \times M$. □

Note that all entries of the matrix $\tilde{H}_{CS}$ in Definition 27
have absolute value equal to either zero or one.

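Example 28: Let

$$H_{CS} \triangleq \begin{pmatrix} 1 & 0 & \sqrt{2}(1+i) \\ -2 & i & 3 \end{pmatrix}.$$

Clearly

$$|H_{CS}| = \begin{pmatrix} 1 & 0 & 2 \\ 2 & 1 & 3 \end{pmatrix},$$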



and so, choosing M = 3 and 



Hcs — 



1 

1 
1 







1+i 1+i n 
1+i n 1+i 

" ^/2 ^/2 


-1 -1 
-1 -1 
-1 -1 


i 
i 
i 


1 1 1 
1 1 1 
1 1 1 



we obtain a matrix described by the procedure of Definition 27. □
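Lemma 29: Let $H_{CS} = (h_{j,i})_{j,i}$ be a measurement matrix
over $\mathbb{C}$ such that $|h_{j,i}| \in \mathbb{Z}_{\ge 0}$ for all $(j, i) \in \mathcal{J}(H_{CS}) \times
\mathcal{I}(H_{CS})$. Let $M \in \mathbb{Z}_{>0}$ be such that $M \ge \max_{(j,i)} |h_{j,i}|$,
and let $\tilde{H}_{CS}$ be a matrix obtained by the procedure in
Definition 27. Moreover, let $|\cdot|_*$ be an arbitrary norm on $\mathbb{C}$.
Then

$$\nu \in \mathrm{Nullsp}_{\mathbb{C}}(H_{CS})
\;\Rightarrow\; \nu^{\uparrow M} \in \mathrm{Nullsp}_{\mathbb{C}}(\tilde{H}_{CS})
\;\Rightarrow\; |\nu^{\uparrow M}|_* \in \mathcal{K}(|\tilde{H}_{CS}|).$$

Additionally, with respect to the first implication sign we have
the following converse: for any $\tilde{\nu} \in \mathbb{C}^{Mn}$ we have

$$\varphi_M(\tilde{\nu}) \in \mathrm{Nullsp}_{\mathbb{C}}(H_{CS})
\;\Leftarrow\; \tilde{\nu} \in \mathrm{Nullsp}_{\mathbb{C}}(\tilde{H}_{CS}).$$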

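Proof: Let $\tilde{H}_{CS} = (\tilde{h}_{(j,m'),(i,m)})_{(j,m'),(i,m)}$. Note that
by the construction in Definition 27 it holds that

$$\sum_{m' \in [M]} \tilde{h}_{(j,m'),(i,m)} = h_{j,i} \quad \text{for any } (j, i, m) \in \mathcal{J} \times \mathcal{I} \times [M],$$

$$\sum_{m \in [M]} \tilde{h}_{(j,m'),(i,m)} = h_{j,i} \quad \text{for any } (j, m', i) \in \mathcal{J} \times [M] \times \mathcal{I}.$$

Let $\nu \in \mathrm{Nullsp}_{\mathbb{C}}(H_{CS})$. Then, for every $(j, m') \in \mathcal{J} \times [M]$
we have

$$\sum_{(i,m) \in \mathcal{I} \times [M]} \tilde{h}_{(j,m'),(i,m)} \, \nu^{\uparrow M}_{(i,m)}
= \sum_{(i,m) \in \mathcal{I} \times [M]} \tilde{h}_{(j,m'),(i,m)} \, \nu_i
= \sum_{i \in \mathcal{I}} \nu_i \sum_{m \in [M]} \tilde{h}_{(j,m'),(i,m)}
= \sum_{i \in \mathcal{I}} h_{j,i} \, \nu_i
= 0,$$

where the last equality follows from the assumption that $\nu \in
\mathrm{Nullsp}_{\mathbb{C}}(H_{CS})$. Therefore $\nu^{\uparrow M} \in \mathrm{Nullsp}_{\mathbb{C}}(\tilde{H}_{CS})$. Because
$|\tilde{h}_{(j,m'),(i,m)}| \in \{0, 1\}$ for all $(j, m', i, m) \in \mathcal{J} \times [M] \times \mathcal{I} \times
[M]$, we can then apply Lemma 25 to conclude that $|\nu^{\uparrow M}|_* \in
\mathcal{K}(|\tilde{H}_{CS}|)$.

Now, in order to prove the last part of the lemma, assume
that $\tilde{\nu} \in \mathrm{Nullsp}_{\mathbb{C}}(\tilde{H}_{CS})$ and define $\nu \triangleq \varphi_M(\tilde{\nu})$. Then for
every $j \in \mathcal{J}$ we have

$$\begin{aligned}
\sum_{i \in \mathcal{I}} h_{j,i} \, \nu_i
&= \sum_{i \in \mathcal{I}} h_{j,i} \cdot \frac{1}{M} \sum_{m \in [M]} \tilde{\nu}_{(i,m)}
= \frac{1}{M} \sum_{i \in \mathcal{I}} \sum_{m \in [M]} h_{j,i} \cdot \tilde{\nu}_{(i,m)} \\
&= \frac{1}{M} \sum_{i \in \mathcal{I}} \sum_{m \in [M]} \sum_{m' \in [M]} \tilde{h}_{(j,m'),(i,m)} \, \tilde{\nu}_{(i,m)}
= \frac{1}{M} \sum_{m' \in [M]} \Bigl( \sum_{i \in \mathcal{I}} \sum_{m \in [M]} \tilde{h}_{(j,m'),(i,m)} \, \tilde{\nu}_{(i,m)} \Bigr) \\
&= 0,
\end{aligned}$$

where the last equality follows from the assumption that
$\tilde{\nu} \in \mathrm{Nullsp}_{\mathbb{C}}(\tilde{H}_{CS})$, i.e., for every $(j, m') \in \mathcal{J} \times [M]$
the expression in parentheses equals zero. Therefore, $\nu =
\varphi_M(\tilde{\nu}) \in \mathrm{Nullsp}_{\mathbb{C}}(H_{CS})$. ■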

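Example 30: Consider the measurement matrix $H_{CS}$ of
Example 28. A possible vector in $\mathrm{Nullsp}_{\mathbb{C}}(H_{CS})$ is given by

$$\nu = \bigl( \sqrt{2}(1+i),\; 2\sqrt{2} - i(3 + 2\sqrt{2}),\; -1 \bigr).$$

Applying Lemma 29 with $M = 3$ and $|\cdot|_* = |\cdot|$, we obtain

$$|\nu^{\uparrow 3}| = (2, 2, 2, \alpha, \alpha, \alpha, 1, 1, 1) \in \mathcal{K}(|\tilde{H}_{CS}|),$$

where $\alpha \triangleq \sqrt{25 + 12\sqrt{2}} = 6.478\ldots$, and where $\tilde{H}_{CS}$ can be
chosen as in Example 28. □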
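The cover construction of Definition 27 and the lifting step of Lemma 29 can be reproduced numerically. The sketch below makes two assumptions that should be stated plainly: it uses cyclic shifts as one particular (not the only) choice of permutation matrices with non-overlapping support, and it assumes that the matrix of Example 28 is $H_{CS} = \bigl(\begin{smallmatrix} 1 & 0 & \sqrt{2}(1+i) \\ -2 & i & 3 \end{smallmatrix}\bigr)$, which is consistent with the nullspace vector of Example 30. The function names are chosen here for illustration.

```python
import math


def cyclic_cover(H, M):
    """M-fold cover in the spirit of Definition 27, using cyclic shifts."""
    rows, cols = len(H), len(H[0])
    cover = [[0j] * (cols * M) for _ in range(rows * M)]
    for j in range(rows):
        for i in range(cols):
            h = complex(H[j][i])
            mag = round(abs(h))  # |h| is assumed to be an integer
            if mag == 0:
                continue  # zero entries become all-zero M x M blocks
            if mag > M:
                raise ValueError("M must be at least max |h_ji|")
            phase = h / abs(h)
            for shift in range(mag):  # distinct shifts have disjoint supports
                for m_row in range(M):
                    cover[j * M + m_row][i * M + (m_row + shift) % M] = phase
    return cover


def lift(v, M):
    """M-fold lifting: every component is repeated M times."""
    return [x for x in v for _ in range(M)]


def project(v_tilde, M):
    """Projection: average over the M copies of every component."""
    return [sum(v_tilde[i * M:(i + 1) * M]) / M for i in range(len(v_tilde) // M)]


def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]


s2 = math.sqrt(2.0)
H = [[1, 0, s2 * (1 + 1j)], [-2, 1j, 3]]                 # assumed matrix of Example 28
nu = [s2 * (1 + 1j), 2 * s2 - 1j * (3 + 2 * s2), -1]     # vector of Example 30
M = 3
H_cover = cyclic_cover(H, M)
nu_lift = lift(nu, M)

print(max(abs(r) for r in matvec(H, nu)))             # residual in the base matrix
print(max(abs(r) for r in matvec(H_cover, nu_lift)))  # residual in the cover
print([round(abs(x), 3) for x in nu_lift])            # component-wise absolute values
```

Up to floating-point rounding, both residuals vanish, and the component-wise absolute values of the lifted vector reproduce the pattern $(2, 2, 2, \alpha, \alpha, \alpha, 1, 1, 1)$ of Example 30.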
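Our third extension of Lemma 11 generalizes the mapping
that is applied to the vectors in the nullspace of the
measurement matrix.

Lemma 31: Let $H_{CS} = (h_{j,i})_{j,i}$ be a measurement matrix
over $\mathbb{C}$ such that $|h_{j,i}| \in \{0, 1\}$ for all $(j, i) \in \mathcal{J}(H_{CS}) \times
\mathcal{I}(H_{CS})$. Let $L \in \mathbb{Z}_{>0}$, let $\|\cdot\|_*$ be an arbitrary norm for
complex vectors, and let $\{\nu^{(\ell)}\}_{\ell \in [L]}$ be a collection of vectors
with $n$ components. Then

$$\nu^{(1)}, \ldots, \nu^{(L)} \in \mathrm{Nullsp}_{\mathbb{C}}(H_{CS}) \;\Rightarrow\; \omega \in \mathcal{K}(|H_{CS}|),$$

where $\omega \in \mathbb{R}^n$ is defined such that for all $i \in \mathcal{I}(H_{CS})$,

$$\omega_i = \bigl\| \bigl( \nu_i^{(1)}, \ldots, \nu_i^{(L)} \bigr) \bigr\|_*.$$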

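Proof: The proof is very similar to the proof of
Lemma 25. Namely, in order to show that $\omega$ is indeed
in the fundamental cone of $|H_{CS}|$, we need to verify (2)
and (3). The way $\omega$ is defined, it is clear that it satisfies (2).
Therefore, let us focus on the proof that $\omega$ satisfies (3).
Namely, from $\nu^{(\ell)} \in \mathrm{Nullsp}_{\mathbb{C}}(H_{CS})$, $\ell \in [L]$, it follows that
$\sum_{i \in \mathcal{I}} h_{j,i} \nu_i^{(\ell)} = 0$, $j \in \mathcal{J}$, $\ell \in [L]$. For all $j \in \mathcal{J}$ and all
$i \in \mathcal{I}_j$, this implies that

$$\begin{aligned}
\omega_i
&= \bigl\| \bigl( \nu_i^{(1)}, \ldots, \nu_i^{(L)} \bigr) \bigr\|_*
= |h_{j,i}| \cdot \bigl\| \bigl( \nu_i^{(1)}, \ldots, \nu_i^{(L)} \bigr) \bigr\|_*
= \bigl\| \bigl( h_{j,i} \nu_i^{(1)}, \ldots, h_{j,i} \nu_i^{(L)} \bigr) \bigr\|_* \\
&= \Bigl\| \Bigl( -\sum_{i' \in \mathcal{I} \setminus i} h_{j,i'} \nu_{i'}^{(1)}, \ldots, -\sum_{i' \in \mathcal{I} \setminus i} h_{j,i'} \nu_{i'}^{(L)} \Bigr) \Bigr\|_*
= \Bigl\| -\sum_{i' \in \mathcal{I} \setminus i} h_{j,i'} \cdot \bigl( \nu_{i'}^{(1)}, \ldots, \nu_{i'}^{(L)} \bigr) \Bigr\|_* \\
&\le \sum_{i' \in \mathcal{I} \setminus i} \bigl\| h_{j,i'} \cdot \bigl( \nu_{i'}^{(1)}, \ldots, \nu_{i'}^{(L)} \bigr) \bigr\|_*
= \sum_{i' \in \mathcal{I} \setminus i} |h_{j,i'}| \cdot \bigl\| \bigl( \nu_{i'}^{(1)}, \ldots, \nu_{i'}^{(L)} \bigr) \bigr\|_* \\
&= \sum_{i' \in \mathcal{I}_j \setminus i} \bigl\| \bigl( \nu_{i'}^{(1)}, \ldots, \nu_{i'}^{(L)} \bigr) \bigr\|_*
= \sum_{i' \in \mathcal{I}_j \setminus i} \omega_{i'},
\end{aligned}$$

showing that $\omega$ indeed satisfies (3). ■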



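Corollary 32: Consider the setup of Lemma 31. Let $L \in
\mathbb{Z}_{>0}$, and select $L$ arbitrary scalars $\alpha^{(\ell)} \in \mathbb{R}_{\ge 0}$, $\ell \in [L]$, and
$L$ arbitrary vectors $\nu^{(\ell)} \in \mathrm{Nullsp}_{\mathbb{C}}(H_{CS})$, $\ell \in [L]$.

• For $\|\cdot\|_* = \|\cdot\|_1$ we have

$$\sum_{\ell \in [L]} \alpha^{(\ell)} \, |\nu^{(\ell)}| \in \mathcal{K}(|H_{CS}|).$$

• For $\|\cdot\|_* = \|\cdot\|_2$ we have

$$\sqrt{\sum_{\ell \in [L]} \bigl(\alpha^{(\ell)}\bigr)^2 \, |\nu^{(\ell)}|^2} \in \mathcal{K}(|H_{CS}|),$$

where the square root and the square of a vector are
understood component-wise.

Proof: These are straightforward consequences of applying
Lemma 31 to $\{\alpha^{(\ell)} \cdot \nu^{(\ell)}\}_{\ell \in [L]}$. ■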
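Both statements of Corollary 32 can be tested on a toy instance. As before, the sketch assumes the standard inequality description of the fundamental cone; the matrix `H`, the two nullspace vectors in `nus`, and the weights in `alphas` are hypothetical values picked here for illustration and do not appear in the paper.

```python
import math


def in_fundamental_cone(H_abs, w, tol=1e-9):
    """Check the fundamental cone inequalities for a zero-one matrix H_abs."""
    if any(x < -tol for x in w):
        return False
    for row in H_abs:
        support = [i for i, h in enumerate(row) if h != 0]
        total = sum(w[i] for i in support)
        if any(w[i] > (total - w[i]) + tol for i in support):
            return False
    return True


# Hypothetical matrix with entries of absolute value 0 or 1, and two nullspace vectors.
H = [[1, 1, 1j, 0],
     [0, 1, -1, 1]]
nus = [[-2 - 1j, 2, 1, -1],
       [-2j, 1j, 1, 1 - 1j]]
alphas = [1.0, 2.0]  # non-negative weights

H_abs = [[abs(h) for h in row] for row in H]
n = len(H[0])

# First statement: conic combination of the component-wise absolute values.
omega_l1 = [sum(al * abs(nu[i]) for al, nu in zip(alphas, nus)) for i in range(n)]
# Second statement: component-wise root of the weighted sum of squares.
omega_l2 = [math.sqrt(sum((al * abs(nu[i])) ** 2 for al, nu in zip(alphas, nus)))
            for i in range(n)]

print(in_fundamental_cone(H_abs, omega_l1))
print(in_fundamental_cone(H_abs, omega_l2))
```

The second combination is not a conic combination of the vectors $|\nu^{(\ell)}|$, yet in this instance it still lands in the fundamental cone, as the corollary predicts.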



Because $\mathcal{K}(|H_{CS}|)$ is a convex cone, the first statement
in Corollary 32 can also be proven by combining $|\nu^{(\ell)}| \in
\mathcal{K}(|H_{CS}|)$, $\ell \in [L]$, with the fact that any conic combination of
vectors in $\mathcal{K}(|H_{CS}|)$ is a vector in $\mathcal{K}(|H_{CS}|)$. In that respect,
the second statement of Corollary 32 is noteworthy in the sense
that although $L$ vectors in $\mathcal{K}(|H_{CS}|)$ are combined in a
"non-conic" way, we nevertheless obtain a vector in $\mathcal{K}(|H_{CS}|)$.
(Of course, for the latter to work it is important that these $L$
vectors are not arbitrary vectors in $\mathcal{K}(|H_{CS}|)$ but that they
are derived from vectors in the $\mathbb{C}$-nullspace of $H_{CS}$.)

We conclude this appendix with two remarks. First, it is
clear that Lemma 31 can be extended in the same way as
Lemma 29 extends Lemma 25. Second, although most of
Section VI is devoted to using Lemma 11 for translating
"positive results" about CC-LPD to "positive results" about
CS-LPD, it is clear that Lemmas 25, 29, and 31 can equally
well be the basis for translating results from CC-LPD to
CS-LPD.

Appendix E 
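Proof of Theorem 13

By definition, $e$ is the original signal. Since $H_{CS} \cdot e = s$
and $H_{CS} \cdot \hat{e} = s$, it easily follows that $\nu \triangleq e - \hat{e}$ is in the
nullspace of $H_{CS}$. So,

$$\begin{aligned}
\|e_S\|_1 + \|e_{\bar S}\|_1 &= \|e\|_1 \\
&\overset{(a)}{\ge} \|\hat{e}\|_1 \\
&= \|e - \nu\|_1 \\
&= \|e_S - \nu_S\|_1 + \|e_{\bar S} - \nu_{\bar S}\|_1 \\
&\overset{(b)}{\ge} \|e_S\|_1 - \|\nu_S\|_1 + \|\nu_{\bar S}\|_1 - \|e_{\bar S}\|_1 \qquad (7) \\
&\overset{(c)}{\ge} \|e_S\|_1 + \bigl(\sqrt{C'} - 2\sqrt{k}\bigr) \cdot \|\nu\|_2 - \|e_{\bar S}\|_1, \qquad (8)
\end{aligned}$$

where step (a) follows from the fact that the solution of CS-LPD
satisfies $\|\hat{e}\|_1 \le \|e\|_1$ and where step (b) follows from
applying the triangle inequality property of the $\ell_1$-norm twice.
Moreover, step (c) follows from

$$\begin{aligned}
-\|\nu_S\|_1 + \|\nu_{\bar S}\|_1 &= \|\nu\|_1 - 2 \cdot \|\nu_S\|_1 \\
&\overset{(d)}{\ge} \sqrt{C'} \cdot \|\nu\|_2 - 2 \cdot \|\nu_S\|_1 \\
&\overset{(e)}{\ge} \sqrt{C'} \cdot \|\nu\|_2 - 2\sqrt{k} \cdot \|\nu_S\|_2 \\
&\overset{(f)}{\ge} \sqrt{C'} \cdot \|\nu\|_2 - 2\sqrt{k} \cdot \|\nu\|_2 \\
&= \bigl(\sqrt{C'} - 2\sqrt{k}\bigr) \cdot \|\nu\|_2,
\end{aligned}$$

where step (d) follows from the assumption that
$w_{\mathrm{p}}^{\mathrm{AWGNC}}(|\nu|) \ge C'$ holds for all $\nu \in \mathrm{Nullsp}_{\mathbb{R}}(H_{CS}) \setminus \{0\}$,
i.e., that $\|\nu\|_1 \ge \sqrt{C'} \cdot \|\nu\|_2$ holds for all $\nu \in \mathrm{Nullsp}_{\mathbb{R}}(H_{CS})$,
where step (e) follows from the inequality $\|a\|_1 \le \sqrt{k} \cdot \|a\|_2$
that holds for any real vector $a$ with $k$ components, and
where step (f) follows from the inequality $\|a_S\|_2 \le \|a\|_2$
that holds for any real vector $a$ whose set of coordinate
indices includes $S$. Subtracting the term $\|e_S\|_1$ on both sides
of (7)-(8), and solving for $\|\nu\|_2 = \|e - \hat{e}\|_2$, we obtain the
claim.

Appendix F
Proof of Theorem 14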

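By definition, $e$ is the original signal. Since $H_{CS} \cdot e = s$
and $H_{CS} \cdot \hat{e} = s$, it easily follows that $\nu \triangleq e - \hat{e}$ is in the
nullspace of $H_{CS}$. So,

$$\begin{aligned}
\|e_S\|_1 + \|e_{\bar S}\|_1 &= \|e\|_1 \\
&\overset{(a)}{\ge} \|e_S\|_1 - \|\nu_S\|_1 + \|\nu_{\bar S}\|_1 - \|e_{\bar S}\|_1 \qquad (9) \\
&\overset{(b)}{\ge} \|e_S\|_1 + (C' - 2k) \cdot \|\nu\|_\infty - \|e_{\bar S}\|_1, \qquad (10)
\end{aligned}$$

where step (a) follows from the same line of reasoning as
steps (a) and (b) in Appendix E, and where step (b) follows from

$$\begin{aligned}
-\|\nu_S\|_1 + \|\nu_{\bar S}\|_1 &= \|\nu\|_1 - 2 \cdot \|\nu_S\|_1 \\
&\overset{(c)}{\ge} C' \cdot \|\nu\|_\infty - 2 \cdot \|\nu_S\|_1 \\
&\overset{(d)}{\ge} C' \cdot \|\nu\|_\infty - 2k \cdot \|\nu_S\|_\infty \\
&\overset{(e)}{\ge} C' \cdot \|\nu\|_\infty - 2k \cdot \|\nu\|_\infty \\
&= (C' - 2k) \cdot \|\nu\|_\infty,
\end{aligned}$$

where step (c) follows from the assumption that
$w_{\text{max-frac}}(|\nu|) \ge C'$ holds for all $\nu \in \mathrm{Nullsp}_{\mathbb{R}}(H_{CS}) \setminus \{0\}$,
i.e., that $\|\nu\|_1 \ge C' \cdot \|\nu\|_\infty$ holds for all $\nu \in \mathrm{Nullsp}_{\mathbb{R}}(H_{CS})$,
where step (d) follows from the inequality $\|a\|_1 \le k \cdot \|a\|_\infty$
that holds for any real vector $a$ with $k$ components, and where
step (e) follows from the inequality $\|a_S\|_\infty \le \|a\|_\infty$ that holds for
any real vector $a$ whose set of coordinate indices includes $S$.
Subtracting the term $\|e_S\|_1$ on both sides of (9)-(10), and
solving for $\|\nu\|_\infty = \|e - \hat{e}\|_\infty$, we obtain the claim.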



Appendix G 
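Proof of Theorem 21

In a first step, we discuss the reformulation of the cost function.
Namely, for arbitrary $x' \in \mathcal{C}_{CC}$, let $e' \triangleq y - x'$ (mod 2),
i.e., $x'_i = y_i - e'_i = y_i + e'_i$ (mod 2) for all $i \in \mathcal{I}$. Then

$$\begin{aligned}
\langle \lambda, x' \rangle = \sum_{i \in \mathcal{I}} \lambda_i x'_i
&\overset{(a)}{=} \sum_{i \in \mathcal{I}} \lambda_i \cdot (y_i + e'_i - 2 y_i e'_i) \\
&= \sum_{i \in \mathcal{I}} \lambda_i y_i + \sum_{i \in \mathcal{I}} \lambda_i \cdot (1 - 2 y_i) \cdot e'_i \\
&\overset{(b)}{=} \sum_{i \in \mathcal{I}} \lambda_i y_i + \sum_{i \in \mathcal{I}} |\lambda_i| \cdot e'_i, \qquad (11)
\end{aligned}$$

where at step (a) we used the fact that for $a, b \in \{0, 1\}$, the
result of $a + b$ (mod 2) can be written over the reals as $a +
b - 2ab$, and at step (b) we used the fact that for all $i \in \mathcal{I}$,
$\lambda_i \cdot (1 - 2 y_i) = |\lambda_i|$. Notice that the first sum in the last line
of (11) is only a function of $y$, hence minimizing $\langle \lambda, x' \rangle =
\sum_{i \in \mathcal{I}} \lambda_i x'_i$ over $x'$ is equivalent to minimizing $\sum_{i \in \mathcal{I}} |\lambda_i| e'_i =
\langle |\lambda|, e' \rangle = \|\lambda_{\operatorname{supp}(e')}\|_1$ over $e'$.

In a second step, we discuss the reformulation of the
constraint. Namely, for arbitrary $x' \in \mathcal{C}_{CC}$, and corresponding
$e' = y - x'$ (mod 2), we have $H_{CC} \cdot e' = H_{CC} \cdot (y - x') =
H_{CC} \cdot y - H_{CC} \cdot x' = H_{CC} \cdot y - 0 = s$ (mod 2).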
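The cost-function identity used in the first step can be verified exhaustively for a short block length. The sketch below assumes the BSC convention stated in Appendix B, i.e., a log-likelihood ratio of $+L$ for a received 0 and $-L$ for a received 1; the function name `cost_identity_holds` and the test values are chosen here for illustration. The identity holds for every binary vector $x'$, so the code does not restrict $x'$ to a particular code.

```python
from itertools import product


def cost_identity_holds(y, L, tol=1e-12):
    """Check <lambda, x'> = sum(lambda_i y_i) + sum(|lambda_i| e'_i) for all binary x'."""
    lam = [L if yi == 0 else -L for yi in y]  # BSC log-likelihood ratios
    for x in product([0, 1], repeat=len(y)):
        e = [(yi - xi) % 2 for yi, xi in zip(y, x)]  # e' = y - x' (mod 2)
        lhs = sum(l * xi for l, xi in zip(lam, x))
        rhs = (sum(l * yi for l, yi in zip(lam, y))
               + sum(abs(l) * ei for l, ei in zip(lam, e)))
        if abs(lhs - rhs) > tol:
            return False
    return True


print(cost_identity_holds([0, 1, 1, 0], L=1.5))
```

Since the first sum on the right-hand side depends only on $y$, minimizing the left-hand side over $x'$ and minimizing $\langle |\lambda|, e' \rangle$ over $e'$ select corresponding vectors.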



Appendix H 
Proof of Theorem 22

Because for $M = 1$ the measurement matrix $\tilde{H}_{CS}$ equals
the measurement matrix $H_{CS}$, it is clear that any feasible
vector of CS-LPD yields a feasible vector of CS-LPD1.

Therefore, let us show that for $M > 1$ no feasible vector of
CS-LPD1 yields a smaller cost function value than the cost
function value of the best feasible vector in the base Tanner
graph. To that end, we demonstrate that for any $M \in \mathbb{Z}_{>0}$, any
$M$-cover based $\tilde{H}_{CS}$, and any $\tilde{e}'$ with $\tilde{H}_{CS} \cdot \tilde{e}' = s^{\uparrow M}$, the
cost function value of $\tilde{e}'$ is never smaller than the cost function
value of the feasible vector in the base Tanner graph given
by the projection $\varphi_M(\tilde{e}')$. Indeed, the cost function value of
$\varphi_M(\tilde{e}')$ is

$$\|\varphi_M(\tilde{e}')\|_1
= \sum_{i \in \mathcal{I}} \Bigl| \frac{1}{M} \sum_{m \in [M]} \tilde{e}'_{(i,m)} \Bigr|
\le \frac{1}{M} \sum_{i \in \mathcal{I}} \sum_{m \in [M]} \bigl| \tilde{e}'_{(i,m)} \bigr|
= \frac{1}{M} \cdot \|\tilde{e}'\|_1,$$

i.e., it is never larger than the cost function value of $\tilde{e}'$.
Moreover, since $\tilde{H}_{CS} \cdot \tilde{e}' = s^{\uparrow M}$ implies that $H_{CS} \cdot \varphi_M(\tilde{e}') = s$,
we have proven the claim that $\varphi_M(\tilde{e}')$ is a feasible vector
in the base Tanner graph.
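The key inequality in this proof, namely that projecting a cover vector never increases the normalized $\ell_1$ cost, is easy to observe numerically. The sketch below is illustrative only: the function names and sample vectors are chosen here, and the coordinate ordering (the $M$ copies of each base coordinate stored consecutively) is an assumption of this sketch.

```python
def project(e_tilde, M):
    """Projection phi_M: average the M copies of every base coordinate."""
    n = len(e_tilde) // M
    return [sum(e_tilde[i * M:(i + 1) * M]) / M for i in range(n)]


def l1(a):
    return sum(abs(x) for x in a)


M = 3
mixed_signs = [2.0, 0.0, 4.0, 1.0, -1.0, 3.0]    # copies of coordinate 2 differ in sign
same_signs = [2.0, 0.0, 4.0, -1.0, -1.0, -4.0]   # copies share a sign per coordinate

for e_tilde in (mixed_signs, same_signs):
    base_cost = l1(project(e_tilde, M))   # cost of the projected vector in the base graph
    cover_cost = l1(e_tilde) / M          # normalized cost in the cover
    print(base_cost, cover_cost, base_cost <= cover_cost)
```

The inequality is strict when the copies of some coordinate have mixed signs and tight when all non-zero copies of every coordinate share the same sign, which is the situation exploited in the second part of the proof of Theorem 24.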



Appendix I 
Proof of Theorem 24

The proof has two parts. First we show that the minimal
cost function value of CS-REL$_{0,\infty}$ is never smaller than the
minimal cost function value of CS-LPD. Second, we show that
for any vector that minimizes the cost function of CS-LPD
there is a graph cover and a configuration therein whose
zero-infinity operator equals the minimal cost function value of
CS-LPD.

We prove the first part. Let $\hat{e}'$ minimize $\|e'\|_1$ over all $e'$
such that $H_{CS} \cdot e' = s$. For any $M \in \mathbb{Z}_{>0}$, any $\tilde{H}_{CS}$ whose
Tanner graph is an $M$-cover of the Tanner graph of $H_{CS}$, and
any $(\tilde{e}', \tilde{s})$ with $\tilde{H}_{CS} \cdot \tilde{e}' = \tilde{s}$ and $\varphi_M(\tilde{s}) = s$, it holds that

$$\frac{1}{M} \cdot \|\tilde{e}'\|_{0,\infty}
\overset{(a)}{\ge} \frac{1}{M} \cdot \|\tilde{e}'\|_1
\overset{(b)}{\ge} \|\varphi_M(\tilde{e}')\|_1
\overset{(c)}{\ge} \|\hat{e}'\|_1,$$

where step (a) follows from Lemma 23, where step (b) uses
the same line of reasoning as the proof of Theorem 22, and
where step (c) follows from the easily verified fact that
$H_{CS} \cdot \varphi_M(\tilde{e}') = s$, along with the definition of $\hat{e}'$. Because $(\tilde{e}', \tilde{s})$
was arbitrary (subject to $\tilde{H}_{CS} \cdot \tilde{e}' = \tilde{s}$ and $\varphi_M(\tilde{s}) = s$), this
observation concludes the first part of the proof.

We now prove the second part. Again, let $\hat{e}'$ minimize
$\|e'\|_1$ over all $e'$ such that $H_{CS} \cdot e' = s$. Once CS-LPD
is rewritten as a linear program (with the help of suitable
auxiliary variables), we see that the coefficients that appear
in this linear program are all rationals. Using Cramer's rule
for determinants, it follows that the set of feasible points of
this linear program is a polyhedral set whose vertices are all
vectors with rational entries. Therefore, if $\hat{e}'$ is unique then $\hat{e}'$
is a vector with rational entries. If $\hat{e}'$ is not unique then there
is at least one vector $\hat{e}'$ with rational entries that minimizes
the cost function of CS-LPD. Let $\hat{e}'$ be such a vector.

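Before continuing, let us simplify the notation slightly.
Namely, we rearrange the constraint $H_{CS} \cdot e' = s$ in CS-LPD
so that it reads

$$\begin{pmatrix} H_{CS} & -I \end{pmatrix} \cdot \begin{pmatrix} e' \\ s \end{pmatrix} = 0, \qquad (12)$$

and then we replace (12) by

$$H_{CS} \cdot e' = 0.$$

This is done by redefining $H_{CS}$ to stand for $(H_{CS}, -I)$,
and redefining $e'$ to stand for $\bigl((e')^{\mathsf{T}}, (s)^{\mathsf{T}}\bigr)^{\mathsf{T}}$. Note that the
redefined $H_{CS}$ contains only zeros, ones, or minus ones. Similarly,
we rearrange the constraint $\tilde{H}_{CS} \cdot \tilde{e}' = \tilde{s}$ in CS-REL$_{0,\infty}$ so
that it reads

$$\begin{pmatrix} \tilde{H}_{CS} & -I \end{pmatrix} \cdot \begin{pmatrix} \tilde{e}' \\ \tilde{s} \end{pmatrix} = 0, \qquad (13)$$

and then we replace (13) by

$$\tilde{H}_{CS} \cdot \tilde{e}' = 0.$$

This is done by redefining $\tilde{H}_{CS}$ to stand for $(\tilde{H}_{CS}, -I)$,
and redefining $\tilde{e}'$ to stand for $\bigl((\tilde{e}')^{\mathsf{T}}, (\tilde{s})^{\mathsf{T}}\bigr)^{\mathsf{T}}$. Note that the
redefined $\tilde{H}_{CS}$ contains only zeros, ones, or minus ones, and
that the Tanner graph representing the redefined $\tilde{H}_{CS}$ is a valid
$M$-fold cover of the Tanner graph representing the redefined
$H_{CS}$.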

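We will now exhibit a suitable $M$-fold cover and a configuration
$\tilde{e}'$ therein such that $\varphi_M(\tilde{e}') = e'$ and such that for
some $\gamma \in \mathbb{R}_{>0}$ the vector $\tilde{e}'$ will satisfy

$$\tilde{e}'_{(i,m)} \in \begin{cases} \{0, +\gamma\} & \text{if } e'_i > 0 \\ \{0\} & \text{if } e'_i = 0 \\ \{0, -\gamma\} & \text{if } e'_i < 0 \end{cases}, \quad (i,m) \in \mathcal{I} \times [M]. \tag{14}$$

Then for such a vector the following holds

$$\frac{1}{M} \|\tilde{e}'\|_{0,\infty} \overset{(a)}{=} \frac{1}{M} \|\tilde{e}'\|_1 \overset{(b)}{=} \|\varphi_M(\tilde{e}')\|_1 \overset{(c)}{=} \|e'\|_1,$$

where step (a) follows from the fact that the equality condition
in Lemma 23 is satisfied, step (b) follows from the fact that for
every $i \in \mathcal{I}$, all nonzero values in $\{\tilde{e}'_{(i,m)}\}_{m \in [M]}$ have the same sign,
and step (c) follows from $\varphi_M(\tilde{e}') = e'$.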
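The chain of equalities can be verified numerically on a small cover vector. The sketch below relies on two definitions that are assumed here, since they are stated elsewhere in the paper and not in this proof: the zero-infinity operator is taken to be $\|a\|_{0,\infty} = \|a\|_0 \cdot \|a\|_\infty$, and $\varphi_M$ is taken to average the $M$ copies of each coordinate. The helper names and the example vector are hypothetical.

```python
from fractions import Fraction

def zero_infinity(a):
    """Assumed definition: ||a||_{0,inf} = ||a||_0 * ||a||_inf."""
    support = sum(1 for x in a if x != 0)
    return support * max(abs(x) for x in a)

def l1(a):
    """The l1 norm of a vector."""
    return sum(abs(x) for x in a)

def phi(e_tilde, M):
    """Assumed projection: average the M copies of every base coordinate.
    e_tilde[i] is the list of the M values sitting above coordinate i."""
    return [Fraction(sum(copies), M) for copies in e_tilde]

# A cover vector of the form (14) with M = 2 and gamma = 2.
M, gamma = 2, 2
e_tilde = [[gamma, gamma], [gamma, 0], [0, -gamma]]
flat = [x for copies in e_tilde for x in copies]

e = phi(e_tilde, M)                       # projection onto the base graph
lhs = Fraction(zero_infinity(flat), M)    # (1/M) * ||e_tilde||_{0,inf}
mid = Fraction(l1(flat), M)               # (1/M) * ||e_tilde||_1
rhs = l1(e)                               # ||phi_M(e_tilde)||_1
```

Here `lhs`, `mid`, and `rhs` all equal 4. If the nonzero entries had different magnitudes, for instance the vector `[3, 1]`, step (a) would become a strict inequality (6 versus 4), which is why (14) forces a single magnitude $\gamma$.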

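Towards constructing such a graph cover and a vector $\tilde{e}'$, we
make the following observations. Namely, fix some $d \in \mathbb{Z}_{>0}$
and some $h_i \in \{-1, +1\}$, $i \in [d]$, and consider the hyperplane

$$\mathcal{A} \triangleq \Big\{ a \in \mathbb{R}^d \ \Big|\ \sum_{i \in [d]} h_i a_i = 0 \Big\}.$$

Let $a^* \in \mathcal{A}$ be a vector with all its coordinates satisfying
$-1 \leq a^*_i \leq +1$, $i \in [d]$. Let $\mathcal{A}^{\square}$ be the set

$$\mathcal{A}^{\square} \triangleq \left\{ a \in \mathbb{R}^d \ \middle|\ \begin{array}{ll} a_i \in [0, +1] & \text{if } a^*_i > 0 \\ a_i = 0 & \text{if } a^*_i = 0 \\ a_i \in [-1, 0] & \text{if } a^*_i < 0 \end{array} \right\},$$

which is a box around $a^*$ whose vertices have only integer
coordinates.

Consider now the set $\mathcal{A}^* \triangleq \mathcal{A} \cap \mathcal{A}^{\square}$, and let $\mathcal{A}'$ be the set
of vertices of $\mathcal{A}^*$. The set $\mathcal{A}^*$ is a polytope and, interestingly,
it can be verified that the set of vertices of $\mathcal{A}^*$ is a subset of
the set of vertices of $\mathcal{A}^{\square}$, i.e., all the points in $\mathcal{A}'$ have integer
coordinates. Because $a^* \in \mathcal{A}^*$, this vector can be written as
a convex combination of the vertices of $\mathcal{A}^*$, i.e., there are
non-negative real numbers $\{\beta_{a'}\}_{a' \in \mathcal{A}'}$ with $\sum_{a' \in \mathcal{A}'} \beta_{a'} = 1$
such that $a^* = \sum_{a' \in \mathcal{A}'} \beta_{a'} a'$. Note that for all $i \in [d]$ the
following holds: if $a^*_i > 0$ then $a'_i \geq 0$ for all $a' \in \mathcal{A}'$, if
$a^*_i < 0$ then $a'_i \leq 0$ for all $a' \in \mathcal{A}'$, and if $a^*_i = 0$ then $a'_i = 0$
for all $a' \in \mathcal{A}'$.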
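For a rational $a^*$, such a convex combination can be written down explicitly. The sketch below is one possible way to do so and is not taken from this paper: it scales $a^*$ by a common denominator $L$, splits every coordinate into unit contributions, and distributes these round-robin over $L$ slots, keeping a separate counter for the coordinates with $h_i a^*_i > 0$ and for those with $h_i a^*_i < 0$. Because both groups carry the same total weight, every slot ends up as an integer point of the box that also lies on the hyperplane. The function name and the example are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from math import lcm

def decompose(a_star, h):
    """Write a rational a_star (entries in [-1, 1], sum_i h_i * a_i = 0) as a
    convex combination of integer box points lying on the same hyperplane.
    Returns a dict mapping each point (a tuple) to its weight beta."""
    a_star = [Fraction(x) for x in a_star]
    assert all(hi in (-1, 1) for hi in h)
    assert all(-1 <= ai <= 1 for ai in a_star)
    assert sum(hi * ai for hi, ai in zip(h, a_star)) == 0
    L = lcm(*(ai.denominator for ai in a_star))     # common denominator
    slots = [[0] * len(a_star) for _ in range(L)]   # L points, with repetition
    position = {+1: 0, -1: 0}  # round-robin counter per sign of h_i * a_i
    for i, (hi, ai) in enumerate(zip(h, a_star)):
        if ai == 0:
            continue                                # coordinate stays at 0
        sign = 1 if ai > 0 else -1
        side = hi * sign
        units = int(abs(ai) * L)   # at most L, so the slots used are distinct
        for _ in range(units):
            slots[position[side] % L][i] = sign
            position[side] += 1
    counts = Counter(tuple(v) for v in slots)
    return {v: Fraction(c, L) for v, c in counts.items()}

# Hypothetical example with d = 3: a_1 - a_2 - a_3 = 0.
h = [1, -1, -1]
a_star = [Fraction(1), Fraction(1, 2), Fraction(1, 2)]
betas = decompose(a_star, h)
```

In this example the result is $a^* = \tfrac{1}{2}(1,1,0) + \tfrac{1}{2}(1,0,1)$. Both points have integer coordinates, satisfy the hyperplane equation, and respect the sign pattern of $a^*$.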

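We now define $\mu \triangleq \max_{i \in \mathcal{I}} |e'_i|$ and apply the above
observations to our setup, in particular to the vector $e'/\mu$,
whose coordinates are rational numbers lying between $-1$ and
$+1$ inclusive. Namely, for every $j \in \mathcal{J}$, we have
$\sum_{i \in \mathcal{I}_j} h_{j,i} \cdot (e'_i/\mu) = 0$ with $h_{j,i} \in \{-1, +1\}$, $i \in \mathcal{I}_j$, and so there is a
set $\mathcal{A}'_j$ and non-negative rational numbers $\{\beta_{j,a'_j}\}_{a'_j \in \mathcal{A}'_j}$ with
$\sum_{a'_j \in \mathcal{A}'_j} \beta_{j,a'_j} = 1$, such that $e'_{\mathcal{I}_j}/\mu = \sum_{a'_j \in \mathcal{A}'_j} \beta_{j,a'_j} a'_j$ holds,
where $e'_{\mathcal{I}_j}$ is the vector $e'$ restricted to the coordinates indexed
by the set $\mathcal{I}_j$. Note that the set $\mathcal{A}'_j$ is such that for all $i \in \mathcal{I}_j$ the
following holds: if $e'_i > 0$ then $a'_{j,i} \in \{0, +1\}$ for all $a'_j \in \mathcal{A}'_j$,
if $e'_i < 0$ then $a'_{j,i} \in \{-1, 0\}$ for all $a'_j \in \mathcal{A}'_j$, and if $e'_i = 0$
then $a'_{j,i} = 0$ for all $a'_j \in \mathcal{A}'_j$.

Let $\mu'$ be the largest positive real number such that $e'_i/\mu' \in \mathbb{Z}$
for all $i \in \mathcal{I}$ and such that $\beta_{j,a'_j}/\mu' \in \mathbb{Z}$ for all $j \in \mathcal{J}$,
$a'_j \in \mathcal{A}'_j$.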
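Since all quantities involved are rational, a number satisfying both divisibility conditions above is a greatest common divisor of rationals. The following minimal sketch computes it; the helper name and the example values are hypothetical and only serve to illustrate the definition.

```python
from fractions import Fraction
from math import gcd, lcm

def rational_gcd(values):
    """Largest positive rational g such that v / g is an integer for all v.
    Zero entries impose no constraint and are skipped."""
    fracs = [Fraction(v) for v in values if v != 0]
    den = lcm(*(f.denominator for f in fracs))          # common denominator
    num = gcd(*(abs(int(f * den)) for f in fracs))      # gcd of the numerators
    return Fraction(num, den)

# Hypothetical rational minimizer and convex-combination weights.
e_prime = [Fraction(3, 2), Fraction(-1, 2), Fraction(1)]
betas = [Fraction(1, 2), Fraction(1, 2)]

mu = max(abs(x) for x in e_prime)
mu_prime = rational_gcd(e_prime + betas)
ratio = mu / mu_prime
```

For these values $\mu = 3/2$ and $\mu' = 1/2$, so the ratio $\mu/\mu'$ equals 3. The ratio is always a positive integer because $\mu$ is the absolute value of one of the entries of $e'$, each of which is an integer multiple of $\mu'$.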

We are now ready to construct the promised $M$-fold cover
of the base Tanner graph and the valid configuration $\tilde{e}'$. We
choose $M \triangleq \mu/\mu'$ (clearly, $M \in \mathbb{Z}_{>0}$), and so the constructed
$\tilde{e}'$ will need to have the properties shown in (14) with $\gamma =
\mu/M = \mu'$. Without going into the details, the $M$-fold cover
with valid configuration $\tilde{e}'$ can be obtained with the help of the
above $\{\beta_{j,a'_j}\}_{j \in \mathcal{J},\, a'_j \in \mathcal{A}'_j}$ values by using a construction that
is very similar to the explicit graph cover construction in [jS,
Appendix A.1]. For example, for every $i \in \mathcal{I}$ with $e'_i > 0$
we set $M \cdot (e'_i/\mu) = e'_i/\mu'$ of the values in $\{\tilde{e}'_{(i,m)}\}_{m \in [M]}$
equal to $\gamma$, and we set $M \cdot (1 - e'_i/\mu) = M - e'_i/\mu'$ of the
values in $\{\tilde{e}'_{(i,m)}\}_{m \in [M]}$ equal to $0$, etc. Similarly, for every
$j \in \mathcal{J}$ and $a'_j \in \mathcal{A}'_j$ we set the local configuration of
$M \cdot (\beta_{j,a'_j}/\mu) = \beta_{j,a'_j}/\mu'$ out of the $M$ copies of the $j$-th check
node equal to $a'_j$. Finally, the edges between the variable and
the constraint nodes of the $M$-fold cover of the base Tanner
graph are suitably defined. (Note that the definition of the
matrix in (13) implies that the edge connections in the part
of the graph cover corresponding to the right-hand side of the
matrix have already been pre-selected. However, this is not a
problem because the variable nodes associated with this part of
the matrix have degree one and because the above-mentioned
constraint node assignments can always be chosen suitably.)

This concludes the second part of the proof. 